On phase transition in
the hard-core model on $\mathbb{Z}^d$

David Galvin 
Department of Mathematics 

Rutgers University 
New Brunswick, NJ 08903 
and 
Jeff Kahn* 

Department of Mathematics and RUTCOR 
Rutgers University 
New Brunswick, NJ 08903 



Abstract 

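It is shown that the hard-core model on $\mathbb{Z}^d$ exhibits a phase
transition at activities above some function $\lambda(d)$ which tends to zero
as $d \to \infty$; that is:

Consider the usual nearest neighbor graph on $\mathbb{Z}^d$, and write
$\mathcal{E}$ and $\mathcal{O}$ for the sets of even and odd vertices (defined
in the obvious way). Set

$$\Lambda_M = \Lambda_M^d = \{z \in \mathbb{Z}^d : \|z\|_\infty \le M\}, \qquad \partial^*\Lambda_M = \{z \in \mathbb{Z}^d : \|z\|_\infty = M\},$$

and write $\mathcal{I}(\Lambda_M)$ for the collection of independent sets (sets
of vertices spanning no edges) in $\Lambda_M$. For $\lambda > 0$ let
$\mathbf{I}$ be chosen from $\mathcal{I}(\Lambda_M)$ with
$\Pr(\mathbf{I} = I) \propto \lambda^{|I|}$.

Theorem There is a constant $C$ such that if
$\lambda > C d^{-1/4} \log^{3/4} d$, then

$$\lim_{M \to \infty} \Pr(0 \in \mathbf{I} \mid \mathbf{I} \supseteq \partial^*\Lambda_M \cap \mathcal{E}) > \lim_{M \to \infty} \Pr(0 \in \mathbf{I} \mid \mathbf{I} \supseteq \partial^*\Lambda_M \cap \mathcal{O}).$$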



Thus, roughly speaking, the influence of the boundary on behavior at 
the origin persists as the boundary recedes.

1 Introduction



The "hard-core model" is a simple mathematical model of a gas with par- 
ticles of non-negligible size. The vertices ("sites") of a graph are regarded 
as positions, each of which can be occupied by a particle, subject to the 
rule that two neighboring sites cannot both be occupied (particles cannot
overlap).

We need a few definitions, but aim to be brief. For good introductions to 
the hard-core model see [1], [10]. See also [8] for more general background, 
and e.g. [2] or [5] for graph theory basics. A few conventions are mentioned 
at the end of this section. 

Write $\mathcal{I}(\Sigma)$ for the collection of independent sets (sets of
vertices spanning no edges) of a graph $\Sigma$.

For $\Sigma$ finite and $\lambda > 0$, the hard-core measure with activity (or
fugacity) $\lambda$ on $\mathcal{I} = \mathcal{I}(\Sigma)$ (or "on $\Sigma$")
is given by

$$\mu(I) = \lambda^{|I|}/Z \quad \text{for } I \in \mathcal{I},$$

where $Z$ is the appropriate normalizing constant (partition function),
$Z = \sum\{\lambda^{|I'|} : I' \in \mathcal{I}\}$. (The more usual etiquette
here considers probability measures on $\{0,1\}^{V(\Sigma)}$ supported on
indicators of independent sets; but the present usage is convenient for us,
and we adhere to it throughout.)
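As a concrete illustration of the definition, the following minimal sketch computes the hard-core measure on a small finite graph by brute-force enumeration. The function names and the choice of the 4-cycle are ours, not the paper's, and the approach is only feasible for very small graphs.

```python
from fractions import Fraction
from itertools import combinations


def independent_sets(vertices, edges):
    """Yield every independent set of the graph as a frozenset."""
    edge_set = {frozenset(e) for e in edges}
    for k in range(len(vertices) + 1):
        for subset in combinations(vertices, k):
            if all(frozenset(p) not in edge_set for p in combinations(subset, 2)):
                yield frozenset(subset)


def hard_core_measure(vertices, edges, lam):
    """Return (Z, mu), where mu maps each independent set I to lam**|I| / Z."""
    weights = {I: lam ** len(I) for I in independent_sets(vertices, edges)}
    Z = sum(weights.values())
    return Z, {I: w / Z for I, w in weights.items()}


# Illustrative example (an assumption of this sketch): the 4-cycle, activity 2.
cycle_vertices = [0, 1, 2, 3]
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
Z, mu = hard_core_measure(cycle_vertices, cycle_edges, Fraction(2))
```

For the 4-cycle the independent sets are the empty set, the four singletons and the two diagonal pairs, so $Z = 1 + 4\lambda + 2\lambda^2$.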

In particular $\lambda = 1$ gives the uniform distribution. One may also assign
different activities $\lambda_v$ to the different vertices $v$ and take
$\mu(I)$ proportional to $\prod_{v \in I} \lambda_v$, but we will not do so
here; again see [1], [10], and also e.g. [14], [11], [13] for some
combinatorial applications.

For infinite $\Sigma$ a measure $\mu$ on $\mathcal{I}(\Sigma)$ is hard-core
with activity $\lambda$ if, for $\mathbf{I}$ chosen according to $\mu$, and for
each finite $W \subseteq V = V(\Sigma)$, the conditional distribution of
$\mathbf{I} \cap W$ given $\mathbf{I} \cap (V \setminus W)$ is $\mu$-a.s. the
hard-core measure with activity $\lambda$ on the independent sets of
$\{w \in W : w \not\sim \mathbf{I} \cap (V \setminus W)\}$ (the vertices that
can still be in $\mathbf{I}$ given $\mathbf{I} \cap (V \setminus W)$). General
considerations (see [8]) imply that there is always at least one such $\mu$;
if there is more than one, the model is said to have a phase transition.

The canonical (and by far most studied) case of the hard-core model is
that of (the usual nearest neighbor graph on) $\mathbb{Z}^d$. Here the seminal
result is due to Dobrushin [6], who proved that there is a phase transition
for sufficiently large $\lambda$, depending on $d$. (Dobrushin's result was
rediscovered by Louth [18] in the context of communications networks.)

The $\lambda$ required in [6] is larger than one would expect,† and attempted
improvements have been the subject of considerable effort (if not publication)
in both the statistical mechanics and discrete mathematics communities in
recent years.

Even the fact that the required $\lambda$ increases with $d$ is a little
strange, since one expects that as $d$ grows phase transition should get
"easier," in the sense that for a given $\lambda$, phase transition in
dimension $d$ should imply phase transition in all higher dimensions; but this
remains open.

Also open is the existence of a "critical" activity, $\lambda_c(d)$, such that
one has phase transition for $\lambda > \lambda_c(d)$ but not for
$\lambda < \lambda_c(d)$. While this seems certain to be true for
$\mathbb{Z}^d$, a cautionary note is sounded in [4], where it is shown that
there are graphs (even trees) for which there is no such critical activity.

As a temporary substitute we may define $\lambda(d)$ to be the supremum of
those $\lambda$ for which the hard-core model with activity $\lambda$ on
$\mathbb{Z}^d$ does not have a phase transition.

So Dobrushin at least tells us that $\lambda(d) < \infty$, while "easier as
dimension grows" would imply $\lambda(d) = O(1)$. A particular question that
has received much of the attention devoted to this problem is whether
$\lambda(d) < 1$ for large $d$. But in fact it has been generally believed
(despite some early guesses to the contrary) that $\lambda(d)$ tends to zero
as $d$ grows; this is what we prove:

Theorem 1.1 $\lambda(d) = O(d^{-1/4} \log^{3/4} d)$.

The bound here is undoubtedly not best possible; $O(\log d/d)$ and $O(1/d)$
are natural guesses at the true value of $\lambda(d)$.

We assume henceforth that d is large enough to support our various as- 
sertions. 

The problem of showing existence of a phase transition may be finitized
as follows. Let $\Lambda = \Lambda_M = \mathbb{Z}^d \cap [-M, M]^d =
\mathcal{O} \cup \mathcal{E}$ with $\mathcal{O}$ and $\mathcal{E}$ the sets
of odd and even vertices (defined in the natural way: $x \in \mathbb{Z}^d$ is
odd if $\sum x_i$ is odd); let $\mu_M$ be the hard-core measure with activity
$\lambda$ on $\Lambda$ (meaning, of course, on the subgraph of $\mathbb{Z}^d$
induced by $\Lambda$); and (with $\mathbf{I}$ chosen according to $\mu_M$) let
$\mu^e_M$ be $\mu_M$ conditioned on the event
$\{\mathbf{I} \supseteq \partial^*\Lambda \cap \mathcal{E}\}$, where
$\partial^*\Lambda := [-M, M]^d \setminus [-(M-1), M-1]^d$, and define
$\mu^o_M$ similarly.

† No explicit bound is given in [6], but several colleagues report that
Dobrushin's argument works for $\lambda > C^d$ for a suitable constant $C$.

In [1] it is shown (inter alia) that the sequences $\{\mu^e_M\}$ and
$\{\mu^o_M\}$ converge to weak limits, called $\mu^e$ and $\mu^o$, and that
there is a phase transition iff these limits are different. (This is mainly
based on the FKG Inequality, and applies to general bipartite graphs
$\Sigma$, provided we allow $\{\Lambda_M\}$ to be an arbitrary nested sequence
with $\bigcup \Lambda_M = V(\Sigma)$.)

Thus it is natural to try to prove phase transition by exhibiting some
statistic distinguishing $\mu^e$ from $\mu^o$. We will show
$\mu^e(0 \in \mathbf{I}) \ne \mu^o(0 \in \mathbf{I})$, i.e.

$$\lim_{M \to \infty} \mu^e_M(0 \in \mathbf{I}) \ne \lim_{M \to \infty} \mu^o_M(0 \in \mathbf{I}). \tag{1}$$

(Of course we are only using the trivial direction of "phase transition iff
$\mu^e \ne \mu^o$." It is not hard to show that (1), too, is equivalent to
phase transition.)
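To make the two boundary conditions concrete, here is a small brute-force sketch that computes the probability that the origin is occupied in a box of $\mathbb{Z}^2$, conditioned on all even (or all odd) boundary vertices being occupied. It is only a toy illustration in dimension 2 with tiny $M$; the function name and parameters are ours, and nothing here bears on the large-$d$, $M \to \infty$ statement of the theorem.

```python
from fractions import Fraction
from itertools import product


def box_conditioned_prob(M, lam, boundary_parity):
    """P(origin occupied) for the hard-core measure on the box [-M, M]^2,
    conditioned on every boundary vertex of the given parity being occupied
    (boundary_parity: 0 for even, 1 for odd)."""
    box = list(product(range(-M, M + 1), repeat=2))
    in_box = set(box)

    def neighbours(v):
        x, y = v
        return [w for w in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))
                if w in in_box]

    # Boundary vertices forced into the independent set by the conditioning.
    forced = {v for v in box
              if max(abs(v[0]), abs(v[1])) == M
              and (v[0] + v[1]) % 2 == boundary_parity}
    # Neighbours of forced vertices can no longer be occupied.
    blocked = {w for v in forced for w in neighbours(v)}
    free = [v for v in box if v not in forced and v not in blocked]

    total = Fraction(0)
    with_origin = Fraction(0)

    def extend(i, chosen, weight):
        nonlocal total, with_origin
        if i == len(free):
            total += weight
            if (0, 0) in chosen:
                with_origin += weight
            return
        v = free[i]
        extend(i + 1, chosen, weight)               # leave v unoccupied
        if not any(w in chosen for w in neighbours(v)):
            chosen.add(v)
            extend(i + 1, chosen, weight * lam)     # occupy v
            chosen.remove(v)

    # The common factor lam**len(forced) cancels in the ratio, so it is omitted.
    extend(0, set(), Fraction(1))
    return with_origin / total


p_even = box_conditioned_prob(2, Fraction(1), 0)
p_odd = box_conditioned_prob(2, Fraction(1), 1)
```

For $M = 2$ and $\lambda = 1$ the even boundary condition leaves the origin and the four diagonal vertices free and mutually non-adjacent, while the odd boundary condition leaves a star centred at the origin, which is why the two probabilities differ in this toy case.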

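To establish (1) (assuming at least $\lambda = \Omega(1/d)$, which is easily
seen to be necessary for phase transition) it is in turn enough to show that
for $v_0 \in \Lambda$,

$$\mu^e_M(v_0 \in \mathbf{I}) < o(1/d) \quad \text{if } v_0 \text{ is odd},$$
$$\mu^o_M(v_0 \in \mathbf{I}) < o(1/d) \quad \text{if } v_0 \text{ is even}.$$

For then (writing $N$ for neighborhood)

$$\mu^e_M(0 \in \mathbf{I}) = \mu^e_M(N(0) \cap \mathbf{I} = \emptyset)\, \mu^e_M(0 \in \mathbf{I} \mid N(0) \cap \mathbf{I} = \emptyset) = (1 - o(1))\lambda/(1 + \lambda),$$

so that $\mu^e(0 \in \mathbf{I}) = (1 - o(1))\lambda/(1 + \lambda)$, whereas
$\mu^o(0 \in \mathbf{I}) = o(1/d)$.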

So in particular the next theorem, whose proof is the main business of 
this paper, contains Theorem 1.1. 

Theorem 1.2 For

$$\lambda = \omega(d^{-1/4} \log^{3/4} d), \tag{2}$$

$M$ arbitrary, and $v_0$ an odd vertex of $\Lambda_M$,

$$\mu^e_M(v_0 \in \mathbf{I}) < (1 + \lambda)^{-2d(1 - o(1))}. \tag{3}$$

The same result holds if we reverse the roles of even and odd.

Remark. It is easy to see that

$$\mu^e_M(v_0 \in \mathbf{I}) = \mu^e_M(N(v_0) \cap \mathbf{I} = \emptyset)\, \mu^e_M(v_0 \in \mathbf{I} \mid N(v_0) \cap \mathbf{I} = \emptyset),$$

so that (3) actually gives the asymptotics of
$\log \mu^e_M(v_0 \in \mathbf{I})$.

Set

$$\mathcal{J} = \{I \in \mathcal{I}(\Lambda) : \partial^*\Lambda \cap \mathcal{E} \subseteq I\}.$$

The proof of Theorem 1.2 is a sort of "Peierls argument" (see e.g. [9]): we
try to associate with each $I \in \mathcal{J}$ containing $v_0$ a "contour"
(some kind of membrane separating the outer even region from an inner odd
region containing $v_0$) and then use this to map $I$ to a large set of $J$'s,
also from $\mathcal{J}$ but not containing $v_0$, each obtained from $I$ by
some modification of the inner region.

This is no surprise: almost every attempt at settling this problem that 
we're aware of has attacked it more or less along these lines. (The one 
exception is the entropy approach of [12], which for now seems unlikely to 
get us to anything like what's proved here.) 

The main difficulty in all these attempts has been getting some kind of 
control over the set of possible "contours." Much of the inspiration for our ap- 
proach to this problem was provided by the beautiful ideas of A. Sapozhenko 
[20], which he used to give, for example, relatively simple derivations of Kor- 
shunov's [16] description of the asymptotics for Dedekind's Problem (in [22]), 
and, in [21], of the asymptotics for the number of independent sets ("codes
of distance 2") in the Hamming cube $\{0,1\}^n$ originally established in [17].

Some of our tools also come from [20]: Lemma 2.17 is an improved version 
of one of Sapozhenko's arguments, and our uses of Lemmas 2.1-2.3 are similar 
to his. 

The rest of the paper is devoted to the proof of Theorem 1.2. Unfortu- 
nately, saying anything even mildly intelligible about the argument turns out 
to be awkward without some preliminaries, so we will wait: see the end of 
Section 2.2 and most of Section 2.6. (Section 2.2 reformulates slightly and 
says what we will actually prove.)

Usage

We use "bigraph" for "bipartite graph." 

For a graph on vertex set $V$, we use $\nabla(W)$ for the set of edges having
exactly one end in $W \subseteq V$ and $\nabla(U, W)$ for the set of edges
having one end in $U$ and the other in $W$.

The neighborhood of (i.e. set of vertices adjacent to) $v$ is $N(v)$;
$N(W) = \bigcup\{N(v) : v \in W\}$; and $\partial W = N(W) \setminus W$. We use
$d(\cdot)$ for degree ($d(v) = |N(v)|$ and $d_W(v) = |N(v) \cap W|$) and
$\mathrm{dist}(\cdot, \cdot)$ for distance.

One common abuse: we often fail to distinguish between a graph and its 
set of vertices, so for instance might use "component" where we should really 
say "set of vertices of a component." 

When the difference makes no difference, we pretend that all large numbers
are integers. All constants implied by the notations $O(\cdot)$,
$\Omega(\cdot)$ are absolute; that is, they do not depend on $d$.

2 Proof of Theorem 1.2 
2.1 Preliminaries 

Here we collect what we will need in the way of known results. 

Lemma 2.1 In any graph with all degrees at most $D$, the number of connected,
induced subgraphs of order $n$ containing a fixed vertex $x_0$ is at most
$(eD)^n$.

This follows from the well-known fact (e.g. [15, p. 396, Ex.11]) that the
infinite $D$-branching rooted tree contains precisely
$\frac{1}{(D-1)n+1}\binom{Dn}{n}$ rooted subtrees of size $n$.
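The bound of Lemma 2.1 can be checked directly for small $n$. The sketch below (our own illustration, with a hypothetical function name) enumerates the connected vertex sets of size $n$ containing the origin in $\mathbb{Z}^d$ by growing them one vertex at a time; this is valid because every connected set with at least two vertices has a vertex other than the origin whose removal leaves it connected.

```python
import math


def connected_sets_containing_origin(n, d=2):
    """All n-vertex subsets of Z^d that contain the origin and induce a
    connected subgraph of the nearest neighbor graph."""
    origin = (0,) * d

    def neighbours(v):
        for i in range(d):
            for s in (1, -1):
                yield v[:i] + (v[i] + s,) + v[i + 1:]

    level = {frozenset([origin])}
    for _ in range(n - 1):
        grown = set()
        for S in level:
            for v in S:
                for w in neighbours(v):
                    if w not in S:
                        grown.add(S | {w})
        level = grown
    return level


# Counts for Z^2 (maximum degree D = 4), to compare with (e * D)**n.
counts = [len(connected_sets_containing_origin(n)) for n in range(1, 5)]
bounds = [(math.e * 4) ** n for n in range(1, 5)]
```

In $\mathbb{Z}^2$ the counts are $n$ times the number of fixed polyominoes with $n$ cells, which stays well below $(4e)^n$ for these small values.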

The next lemma is a special case of a fundamental result due to Lovász
[19] and Stein [23] (see also [7]). For a bigraph $\Sigma$ with bipartition
$X \cup Y$, say $Y' \subseteq Y$ covers $X$ if each $x \in X$ has a neighbor
in $Y'$.

Lemma 2.2 If $\Sigma$ as above satisfies $d(x) \ge a$ $\forall x \in X$ and
$d(y) \le b$ $\forall y \in Y$, then $X$ is covered by some $Y' \subseteq Y$
of size at most $(|Y|/a)(1 + \ln b)$.

Call a set $T$ of vertices of a graph $c$-clustered if for any $x, y \in T$
there are vertices $x = x_0, x_1, \ldots, x_k = y$ of $T$ with
$\mathrm{dist}(x_{i-1}, x_i) \le c$ for all $i$. The next lemma is from [20]
(see Lemma 2.1); the interested reader should have no difficulty supplying a
proof.

Lemma 2.3 If $\Sigma$ is a graph on $V$ and $S, T \subseteq V$ satisfy

(i) $S$ is $a$-clustered,

(ii) $\mathrm{dist}(x, T) \le b$ $\forall x \in S$ and
$\mathrm{dist}(y, S) \le b$ $\forall y \in T$,

then $T$ is $(a + 2b)$-clustered.
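The notion of $c$-clusteredness is easy to test mechanically. The following sketch (names are ours) checks it for a finite set of lattice points, using the fact that graph distance in $\mathbb{Z}^d$ is the $\ell_1$ distance, and then tries the conclusion of Lemma 2.3 on a toy pair $S$, $T$ chosen for illustration.

```python
from collections import deque


def l1_dist(x, y):
    """Graph distance between two points of Z^d."""
    return sum(abs(a - b) for a, b in zip(x, y))


def is_c_clustered(points, c, dist=l1_dist):
    """True if any two points are joined by a chain of points of the set in
    which consecutive points are at distance at most c."""
    points = list(set(points))
    if not points:
        return True
    seen = {points[0]}
    queue = deque([points[0]])
    while queue:
        x = queue.popleft()
        for y in points:
            if y not in seen and dist(x, y) <= c:
                seen.add(y)
                queue.append(y)
    return len(seen) == len(points)


# Toy instance: S is 2-clustered, and every point of S (resp. T) is within
# distance b = 3 of T (resp. S), so Lemma 2.3 predicts T is 8-clustered.
S = [(0, 0), (2, 0), (4, 0)]
T = [(0, 1), (4, 1)]
```

Here the two points of $T$ are at distance 4, so $T$ is 8-clustered (as the lemma predicts) but not 3-clustered.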

Finally, we need to know something about isoperimetry in $\mathbb{Z}^d$.
Write $|x|$ for the $\ell_1$-norm of $x$, and set
$B(r) = \{x \in \mathbb{Z}^d : |x| \le r\}$,
$S(r) = \{x \in \mathbb{Z}^d : |x| = r\}$, $b(r) = |B(r)|$ and
$s(r) = |S(r)|$.

Lemma 2.4 Let $C$ be a subset of $\mathbb{Z}^d$ with

$$|C| = b(r) + \alpha s(r + 1),$$

where $0 \le \alpha < 1$. Then

$$|\partial C| \ge (1 - \alpha)s(r + 1) + \alpha s(r + 2).$$

This is an immediate consequence of a corresponding inequality for the torus
$(\mathbb{Z}/k\mathbb{Z})^d$, given by Bollobás and Leader in [3, Cor. 5]. The
case $\alpha = 0$ was proved by Wang and Wang [24].
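A quick numerical sanity check of Lemma 2.4 in low dimension is sketched below. The helper names are ours, the balls and spheres are built by brute force, and the $3 \times 3$ square is just an illustrative test set.

```python
from fractions import Fraction
from itertools import product


def l1_ball(r, d):
    """B(r): points of Z^d with l1-norm at most r."""
    return {x for x in product(range(-r, r + 1), repeat=d)
            if sum(map(abs, x)) <= r}


def l1_sphere(r, d):
    """S(r): points of Z^d with l1-norm exactly r."""
    return {x for x in product(range(-r, r + 1), repeat=d)
            if sum(map(abs, x)) == r}


def external_boundary(C):
    """Vertices outside C that have a neighbour in C (N(C) minus C)."""
    d = len(next(iter(C)))
    nbrs = set()
    for v in C:
        for i in range(d):
            for s in (1, -1):
                nbrs.add(v[:i] + (v[i] + s,) + v[i + 1:])
    return nbrs - set(C)


def lemma_2_4_bound(size, d):
    """Lower bound of Lemma 2.4 for a set of the given size (size >= 1)."""
    r = 0
    while len(l1_ball(r + 1, d)) <= size:
        r += 1
    s1 = len(l1_sphere(r + 1, d))
    s2 = len(l1_sphere(r + 2, d))
    alpha = Fraction(size - len(l1_ball(r, d)), s1)
    return (1 - alpha) * s1 + alpha * s2


square = set(product(range(3), repeat=2))       # a 3 x 3 block in Z^2
square_boundary = len(external_boundary(square))
square_bound = lemma_2_4_bound(len(square), 2)
```

For the square, $|C| = 9 = b(1) + \tfrac12 s(2)$, so the bound is $\tfrac12 \cdot 8 + \tfrac12 \cdot 12 = 10$, while the actual boundary has 12 vertices. For an $\ell_1$ ball the bound is attained, since $\partial B(r) = S(r+1)$.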

2.2 To prove 

We assume henceforth that $\lambda$ satisfies (2). We prove only the first
part of Theorem 1.2 ((3) for odd $v_0$); switching "even" and "odd" throughout
the argument gives the proof of the second part.

It will be convenient to replace the box $\Lambda_M$ by the discrete torus
$T = T_M$ obtained from $\Lambda_M$ by setting $M = -M$ and identifying
vertices accordingly. Following our favorite abuse, we regard $T$ as either a
graph or a set of vertices as convenient.

We then use $\Lambda$ for the image of $\partial^*\Lambda_M$ under the natural
projection $\Lambda_M \to T$, and continue to write $v_0$ for the image of
$v_0$ in $T$, and to use $\mathcal{O}$ and $\mathcal{E}$ for the sets of odd
and even vertices of $T$.

Having done this, we replace $\partial^*\Lambda_M$ by $\Lambda$ in the
definition of $\mathcal{J}$ ($\mathcal{J} = \{I \subseteq T : I \text{
independent}, \Lambda \cap \mathcal{E} \subseteq I\}$), define $\mu^e_M$,
$\mu^o_M$ as before, and simply regard Theorem 1.2 as referring to $T$, a
change which clearly does not affect its meaning.

We will show a bit more than (3): for $I \in \mathcal{J}$, let $Z = Z(I)$ be
the component of $T - (I \cap \mathcal{O})$ containing $\Lambda$; then

$$\mu^e_M(v_0 \notin Z(\mathbf{I})) < (1 + \lambda)^{-(2 - o(1))d}. \tag{4}$$

Let $\mathcal{J}_0 = \{I \in \mathcal{J} : v_0 \notin Z(I)\}$, and write
$w(I)$ for $\lambda^{|I|}$. We prove (4) by producing a "flow"
$\nu : \mathcal{J}_0 \times \mathcal{J} \to [0, 1]$ satisfying

$$\sum_{J} \nu(I, J) = 1 \quad \forall I \in \mathcal{J}_0 \tag{5}$$

and

$$\sum_{I} \frac{w(I)}{w(J)} \nu(I, J) \le (1 + \lambda)^{-(2 - o(1))d} \quad \forall J \in \mathcal{J}. \tag{6}$$

This gives (4):

$$\sum_{I \in \mathcal{J}_0} w(I) = \sum_{I \in \mathcal{J}_0} w(I) \sum_{J \in \mathcal{J}} \nu(I, J) = \sum_{J \in \mathcal{J}} w(J) \sum_{I \in \mathcal{J}_0} \frac{w(I)}{w(J)} \nu(I, J) \le (1 + \lambda)^{-(2 - o(1))d} \sum_{J \in \mathcal{J}} w(J).$$

Throughout our discussion we fix $v_0$ and use $I$ for members of
$\mathcal{J}_0$ and $J$ for general members of $\mathcal{J}$.

The definition of $\nu(I, \cdot)$ will depend on a pair
$(G, A) = (G(I), A(I)) \in 2^{\mathcal{E}} \times 2^{\mathcal{O}}$ associated
with $I$. The construction and salient properties of the pair are given in
Sections 2.3 and 2.4, but it will not be until Section 2.11 that we are able
to specify $\nu$. First steps toward this specification are taken in
Section 2.5, which finally puts us in a position (in Section 2.6) to give
some clue as to how the main part of the argument will proceed.

2.3 "Contours"

For a set $P$ of vertices (in any graph) we use $\partial^*P$ for the internal
boundary of $P$:

$$\partial^*P = \{v \in P : N(v) \not\subseteq P\}.$$

The following observation is used several times, so we record it as a lemma;
its easy proof is left to the reader.

Lemma 2.5 Let $\Sigma$ be a graph, $S \subseteq V(\Sigma)$, and $T$ (the
vertex set of) some component of $\Sigma - (S \setminus \partial^*S)$. Then
$\partial^*T \subseteq \partial^*S$.
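Lemma 2.5 is easy to test on a small example. The sketch below (our own helper names, on a $6 \times 6$ grid chosen purely for illustration) computes internal boundaries and the components left after deleting the interior of a set $S$, and then compares boundaries.

```python
def grid_graph(n):
    """Adjacency lists of the n x n grid graph."""
    adj = {}
    for x in range(n):
        for y in range(n):
            adj[(x, y)] = [(x + dx, y + dy)
                           for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                           if 0 <= x + dx < n and 0 <= y + dy < n]
    return adj


def internal_boundary(adj, P):
    """Vertices of P having at least one neighbour outside P."""
    P = set(P)
    return {v for v in P if any(w not in P for w in adj[v])}


def components(adj, vertices):
    """Connected components of the subgraph induced on the given vertices."""
    remaining = set(vertices)
    comps = []
    while remaining:
        start = remaining.pop()
        comp, stack = {start}, [start]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w in remaining:
                    remaining.remove(w)
                    comp.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps


adj = grid_graph(6)
S = {(x, y) for x in range(1, 5) for y in range(1, 5)}   # a 4 x 4 block
boundary_S = internal_boundary(adj, S)
# Delete the interior of S and look at what remains.
comps = components(adj, set(adj) - (S - boundary_S))
```

In this example a single component remains, and its internal boundary consists of the eight vertices of the ring $\partial^*S$ that touch the deleted $2 \times 2$ interior.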

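Let $I \in \mathcal{J}_0$, $Z = Z(I)$ be as in Section 2.2, and set
$Z_0 = \partial^*Z$. By the definition of $Z$, it is clear that
$Z_0 \subseteq \mathcal{E}$ and $Z_0 \cap I = \emptyset$. Let $W'$ be the
component of $v_0$ in the graph $T - (Z \setminus Z_0)$. By Lemma 2.5,
$\partial^*W' \subseteq W' \cap Z_0 \subseteq \mathcal{E}$.

Let $W'' = W' \cup \{x \in \mathcal{O} : N(x) \subseteq W'\}$. This is clearly
connected, with $\partial^*W'' \subseteq \partial^*W'$.

Now consider $T - (W'' \setminus \partial^*W'')$. This breaks into a number of
components, one of which, $C$ say, contains $\Lambda$. Again using Lemma 2.5,
we have $\partial^*C \subseteq C \cap \partial^*W''$. Finally, set
$W = V \setminus (C \setminus \partial^*C)$, $G = W \cap \mathcal{E}$,
$A = W \cap \mathcal{O}$, and

$$G_0 = \partial^*W.$$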

The next proposition collects relevant properties of these objects. Once 
we have these properties, we will not be concerned with how G, A etc. were 
derived from /. 

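Proposition 2.6

$$v_0 \in A; \quad W \cap \Lambda = \emptyset; \tag{7}$$

$$\text{both } C \text{ and } W \text{ are connected}; \tag{8}$$

$$G_0 = \partial^*C; \tag{9}$$

$$G = N(A) \text{ and } A = \{x \in \mathcal{O} : N(x) \subseteq G\}; \tag{10}$$

$$G_0 \cap I = \emptyset; \tag{11}$$

$$N(G_0) \cap I \subseteq A; \tag{12}$$

$$G_0 \subseteq N(A \cap I). \tag{13}$$

Proof. Both (7) and the connectivity of $C$ are immediate. To see that $W$ is
connected, notice that each component of $T - (W'' \setminus \partial^*W'')$
must meet $\partial^*W''$ (or it would be a component of the connected graph
$T$). Thus $W$ is the union of the connected set $W''$ and a number of other
connected sets each of which meets $W''$, so is itself connected. So we have
(8).

For (9): $\partial^*C \subseteq W \cap \mathcal{E}$ and the connectivity of
$C$ give

$$x \in \partial^*C \Rightarrow \emptyset \ne N(x) \cap C \subseteq C \cap \mathcal{O} \subseteq C \setminus \partial^*C \Rightarrow x \in \partial^*W,$$

so $\partial^*C \subseteq \partial^*W$; and Lemma 2.5 and the connectivity of
$W$ give the reverse containment.

Connectivity of $W$ and the fact that $G_0 \subseteq \mathcal{E}$ give
$G = N(A)$. That $A \subseteq \{x \in \mathcal{O} : N(x) \subseteq G\}$
follows from $G = N(A)$ (or just $\partial^*W \subseteq \mathcal{E}$). For the
reverse containment, notice that
$x \notin W \Rightarrow N(x) \cap W \subseteq G_0 \subseteq W'$, whereas
$N(x) \subseteq W'$ would imply $x \in W'' \subseteq W$; so
$x \notin W \Rightarrow N(x) \not\subseteq W$.

For (11) recall that $G_0 = \partial^*C \subseteq \partial^*W'' \subseteq
\partial^*W' \subseteq Z_0$ and $Z_0 \cap I = \emptyset$.

That $N(G_0) \cap I \subseteq A$ follows from $G_0 \subseteq \partial^*W'$,
since $N(\partial^*W') \cap I$ is clearly contained in $A$.

Finally, $v \in G_0 \Rightarrow v \in Z_0 \Rightarrow v \sim I$, so (13)
follows from (12).

□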

2.4 Topology 

The purpose of this section is to prove, for any $I \in \mathcal{J}_0$ and
$W$, $G$ etc. produced from $I$ as in Section 2.3,

$$G_0 \text{ is 2-clustered.} \tag{14}$$

Our proof of this, which is considerably longer than we would wish and
unrelated to the methods in the rest of the paper, might profitably be skipped
on a first reading.

Though (14) turns out to follow from the connectivity of $W$ and $C$ (see
(8)), we could not see a simple combinatorial proof of the implication, and
our argument requires a little topological detour, based on

Lemma 2.7 If $U, V$ are connected subsets of $X = \mathbb{R}^n$ or $S^n$,
$n > 1$, with $U \cup V = X$, $U$ closed and $V$ compact, then $U \cap V$ is
connected.

(As usual, $S^n$ is the unit sphere $\{x \in \mathbb{R}^{n+1} : \|x\| = 1\}$.
We also write $B^{n+1}$ for the corresponding unit ball.)

The (presumably well-known) proof of Lemma 2.7 is given at the end of 
this section. 

It will be convenient here to write $\Omega$ for the nearest neighbor graph on
$\mathbb{Z}^d$. As usual, $\Omega[S]$ is the subgraph induced by $S$. We will
prove (14) in the following more general form.

Proposition 2.8 Let $R \cup B$ be a decomposition of $V(\Omega)$
($= \mathbb{Z}^d$), with both $\Omega[R]$ and $\Omega[B]$ connected and $R$
finite. Suppose $G := R \cap B$ is contained in $\mathcal{E}$ and is the
internal boundary of each of $R$, $B$. Then $G$ is 2-clustered.

Remark. We will actually show that $G$ is 2-clustered in each of $R$ and $B$.

Proof. With $\Omega$ embedded in $\mathbb{R}^d$ in the natural way, we extend
$R$ and $B$ to closed connected subsets $R^*$ and $B^*$ of $\mathbb{R}^d$ so
that $R^* \cup B^* = \mathbb{R}^d$ and $G^* := R^* \cap B^*$ is
path-connected. We then derive the 2-clusteredness of $G$ from the
path-connectedness of $G^*$.

We view $\mathbb{R}^d$ as the union of $\mathbb{Z}^d$-translates of
$[0, 1]^d$ (the cells of $\mathbb{R}^d$), and define $R^*$ and $B^*$ cell by
cell. Within a cell we proceed by dimension, first defining the extensions for
0-dimensional faces (the vertices of $\Omega$), 1-dimensional faces (the edges
of $\Omega$), and 2-dimensional faces, and then continuing inductively. (As
usual a face of a cell is the intersection of the cell with some supporting
hyperplane. Henceforth we use "$k$-face" for "$k$-dimensional face.") For the
inductive step, we need a topological lemma (Lemma 2.11), for the statement of
which it's convenient to introduce two local definitions. Let us say that a
subset of a topological space is civilized if it is closed, has only finitely
many components, and each of its components is path-connected.

Definition 2.9 A decomposition $X = R \cup B$ of a topological space $X$, with
$R \cap B = G$, is nice if it satisfies:

(i) $G = \partial R = \partial B$;

(ii) each of $R$, $B$, $G$ is civilized; and

(iii) each of $R$, $B$ (and so each component of $R$ and $B$) is the closure
of the union of finitely many open, path-connected sets.

If $X = R \cup B$ is a nice decomposition, and $R'$, $B'$ are obtained from
$R$, $B$ by adding finitely many points, then we also call the decomposition
$X = R' \cup B'$ nice.

(Of course there is some redundancy in conditions (i)-(iii).)

We say that two nice decompositions $X_1 = R_1 \cup B_1$ and
$X_2 = R_2 \cup B_2$ are compatible if
$R_1 \cap X_1 \cap X_2 = R_2 \cap X_1 \cap X_2$ and
$B_1 \cap X_1 \cap X_2 = B_2 \cap X_1 \cap X_2$. It's straightforward to check
that nice decompositions of different spaces can be combined if they are
compatible:

Lemma 2.10 Suppose $X = X_1 \cup \cdots \cup X_m$ with each $X_i$ closed. If
$X_i = R_i \cup B_i$ are pairwise compatible, nice decompositions, then
$(\bigcup R_i) \cup (\bigcup B_i)$ is a nice decomposition of $X$.

We now state the topological lemma alluded to above, deferring its proof
until after the derivation of Proposition 2.8. (Recall $B^{n+1}$ and $S^n$ are
the unit ball and sphere in $\mathbb{R}^{n+1}$.)

Lemma 2.11 Assume $n > 1$. If $R \cup B$ is a nice decomposition of $S^n$,
then there is a nice decomposition $R^* \cup B^*$ of $B^{n+1}$, with
$R^* \cap S^n = R$, $B^* \cap S^n = B$, and such that if $C$ is any component
of $R^*$ (resp. $B^*$, $G^*$), then $C \cap S^n$ is a component of $R$ (resp.
$B$, $G$).

(This is easily seen to fail for $n = 1$. It may be worth pointing out that
for $R$ and $B$, condition (iii) of Definition 2.9 refers to sets that are
open in $S^n$; similarly $\partial R$ and $\partial B$ are boundaries relative
to $S^n$, while $\partial R^*$ and $\partial B^*$ are boundaries relative to
$B^{n+1}$.)

Of course Lemma 2.11 still applies if we replace $B^{n+1}$ by any of its
homeomorphic images (and $S^n$ by the corresponding homeomorphic copy); in our
case the relevant image will be $[0, 1]^d$.

We now fix a cell, and begin defining our extensions. For vertices and
edges we do the natural things: $R^* \cap V(\Omega) = R$,
$B^* \cap V(\Omega) = B$; and we put (the interior of) an edge in $R^*$ (resp.
$B^*$) iff both its ends are in $R^*$ (resp. $B^*$), noting that exactly one
of these possibilities occurs, since $\nabla(G, G) = \emptyset$.

Next, we deal with 2-dimensional faces. If the vertices of such a face
are all in $R$ (resp. $B$), then put the interior of the face in $R^*$ (resp.
$B^*$). Otherwise, the face has two opposite corner vertices ($v_1, v_3$, say)
in $G$, with one of its remaining two vertices ($v_2$) in $R \setminus B$ and
the other ($v_4$) in $B \setminus R$. Put the interior of the convex hull of
$v_1, v_2, v_3$ in $R^*$, the interior of the convex hull of $v_1, v_3, v_4$
in $B^*$, and the interior of the diagonal joining $v_1$ and $v_3$ in
$R^* \cap B^*$. It is easy to check that these $(R^*, B^*)$-decompositions of
the 2-dimensional faces are nice. (It may be worth observing that a
2-dimensional face contained in $R^*$ may still have one or two of its
vertices in $B^*$, and vice versa.)

We now proceed by induction, assuming the decomposition has been
defined on faces of dimension less than $k \in \{3, \ldots, d\}$. Each
$k$-face $F$ is homeomorphic to $B^k$, and is bounded by the union of finitely
many $(k-1)$-dimensional faces. The decomposition of each of these bounding
faces is nice, and the decompositions on any two faces are compatible (since
we are defining the decomposition from lower dimensions up). So, by
Lemma 2.10, we have a nice decomposition of the boundary of $F$. We now apply
Lemma 2.11 to extend to a nice decomposition of the entire face. Once we have
a nice decomposition of each cell, we get the full decomposition
$\mathbb{R}^d = R^* \cup B^*$ by combining the decompositions of the cells,
again appealing to Lemma 2.10 for "nice." (For formal applicability of the
lemma, we can use a single $X_i = B_i$ for the union of all cells not meeting
$R$.)

It is clear from the construction that $R^*$ and $B^*$ are closed, $R^*$ is
bounded, and $R^* \cup B^* = \mathbb{R}^d$. To see that $R^*$ is connected,
notice that by construction, any component of $R^*$ contains an edge of
$\Omega[R]$, and that every edge of $\Omega[R]$ is contained in a component of
$R^*$; connectivity of $R^*$ then follows from connectivity of $\Omega[R]$.
The same argument shows that $B^*$ is connected.

Lemma 2.7 now shows that $G^*$ is connected, which, since $G^*$ is also
civilized (since $R^* \cup B^*$ is nice), implies that it is actually
path-connected.

It remains to show that path-connectedness of $G^*$ implies 2-clusteredness
of $G$. It is enough to show that for each pair of vertices $u, v \in G$,
there is a path connecting them in $G^*$ which is supported entirely on the
2-dimensional faces of $\mathbb{R}^d$; for, by the construction of $R^*$ and
$B^*$, such a path is supported on diagonals (of 2-dimensional faces)
connecting pairs of vertices from $G$, and such diagonals correspond to steps
of length 2 in $\Omega$. (This also justifies the remark following
Proposition 2.8.)

So, consider a $(u, v)$-path $P$ in $G^*$ given by the continuous function
$f : [0, 1] \to \mathbb{R}^d$. If $P$ is supported on 2-dimensional faces of
$\mathbb{R}^d$, then we are done. Otherwise, let $k > 2$ be the maximum
dimension of a face whose interior meets $P$. It's enough to show that we can
replace $P$ by a path meeting the interiors of fewer $k$-faces than $P$ and no
faces of dimension more than $k$.

To do this, choose a $k$-face $F$ and component $C$ of $G^* \cap F$ with
$C \cap F^\circ \cap P \ne \emptyset$ (where $F^\circ$ is the interior of
$F$). Let $p = \inf\{x \in [0, 1] : f(x) \in C \cap F^\circ\}$ and
$q = \sup\{x \in [0, 1] : f(x) \in C \cap F^\circ\}$. Then
$f(p), f(q) \in C \cap \partial F$, which, by construction, is path-connected.
So we may replace $f([p, q])$ in $P$ by a path contained in $\partial F$.



Proof of Lemma 2.11 

To avoid confusion, we now write $\partial X$, $\partial' X$ and
$\partial'' X$ for the boundaries of $X$ relative to, respectively,
$\mathbb{R}^{n+1}$, $B^{n+1}$ and $S^n$.

We may assume neither $R$ nor $B$ contains isolated points: otherwise we
can simply delete such points, produce $R^*$ and $B^*$ for the resulting
"reduced" $R$ and $B$, and then add the deleted points of $R$ ($B$) to $R^*$
($B^*$).

We use $(R, B)$-component to mean a component of either $R$ or $B$, and
proceed by induction on the number of $(R, B)$-components in the decomposition
of $S^n$.

If there is exactly one such component (a component of $R$, say), then
$R = S^n$, and $B = \emptyset$. Setting $R^* = B^{n+1}$ and $B^* = \emptyset$,
we get a nice decomposition of $B^{n+1}$ which satisfies the conditions of the
lemma.

Otherwise, there must be at least one $(R, B)$-component $T$ for which
$S^n \setminus T^\circ$ is connected. For suppose $S^n \setminus T^\circ$ is
disconnected for every $(R, B)$-component $T$. Choose an $(R, B)$-component
$T_0$ ($\subseteq R$, say) such that one of the components of
$S^n \setminus T_0^\circ$, $C$ say, contains as few $(R, B)$-components as
possible, and let $T_1$ be an $(R, B)$-component of $C$ (i.e. contained in
$C$, noting that each $(R, B)$-component other than $T_0$ is either contained
in or disjoint from $C$). Now $S^n \setminus C^\circ$ is connected in
$S^n \setminus T_1^\circ$, so $S^n \setminus T_1^\circ$ (which by assumption
is not connected) contains a component whose $(R, B)$-components form a proper
subset of the $(R, B)$-components of $C$, contradicting the choice of $T_0$.

Let $T$, then, be an $(R, B)$-component with $S^n \setminus T^\circ$
connected. We may assume that $T$ is a component of $R$. Applying Lemma 2.7
with $X = S^n$, $U = T$ and $V = S^n \setminus T^\circ$, we find that
$\partial'' T$ is connected, so that $T$ meets exactly one component, say $C$,
of $B$ (and $C \supseteq \partial'' T$).

Set $T^* = \{\lambda x : x \in T, \lambda \in [1/2, 1]\}$. This will be one
component of $R^*$. It is easy to see that $T^*$ is closed and path-connected
(so civilized), as is $\partial' T^*$, and that $T^* \cap S^n = T$, a
component of $R$.

Now let $(T^*)^\circ$ be the relative interior of $T^*$ with respect to
$B^{n+1}$ (namely,
$(T^*)^\circ = \{\lambda x : x \in T^\circ, \lambda \in (1/2, 1]\}$),
$P = \partial(B^{n+1} \setminus (T^*)^\circ)$
($= (S^n \setminus T^\circ) \cup \partial' T^*$), and
$Q = B^{n+1} \setminus (T^*)^\circ$. Then $(Q, P)$ is (easily seen to be)
homeomorphic to $(B^{n+1}, S^n)$.

Let, further, $R_1 = R \setminus T$, $B_1 = B \cup \partial' T^*$, and
$C_1 = C \cup \partial' T^*$. Then

(i) the components of $R_1$ are precisely the components of $R$ other than
$T$,

(ii) the components of $B_1$ are $C_1$ and the components of $B$ other than
$C$,

and it is easy (if tedious) to deduce that $R_1 \cup B_1$ is a nice
decomposition of $P$.

Our inductive hypothesis thus gives a nice decomposition $R_1^* \cup B_1^*$ of
$Q$, and we obtain the desired decomposition, $R^* \cup B^*$, of $B^{n+1}$ by
setting $B^* = B_1^*$ and $R^* = R_1^* \cup T^*$ (again an easy verification
using (i) and (ii)).



Proof of Lemma 2.7

We first establish a corresponding statement for open sets: if $U, V$ are
connected, open subsets of $X = \mathbb{R}^n$ or $S^n$, $n > 1$, with
$U \cup V = X$, then $U \cap V$ is connected.

Proof. We use the Mayer-Vietoris sequence. If $X$ is a topological space, and
$U$ and $V$ are open subsets of $X$ whose union is $X$, then this is a long
exact sequence of group homomorphisms ending with

$$\cdots \to H_1(X) \to H_0(U \cap V) \to H_0(U) \oplus H_0(V) \to H_0(X) \to 0,$$

where $H_m$ is the $m$th homology group. We apply this with $X = \mathbb{R}^n$
or $S^n$. Using the facts that $H_1(X) = 0$ for these $X$ (as $n > 1$) and
that if $O$ is an open subset of $\mathbb{R}^n$ or $S^n$, then
$H_0(O) = \mathbb{Z}$ iff $O$ is connected, this long exact sequence becomes

$$0 \to H_0(U \cap V) \to \mathbb{Z} \oplus \mathbb{Z} \to \mathbb{Z} \to 0.$$

From the exactness of this sequence, it follows that
$H_0(U \cap V) = \mathbb{Z}$, so that $U \cap V$ is connected.

□

Now let $U, V$ be as in the lemma, and for each $\varepsilon > 0$, set
$U_\varepsilon = \{x \in X : d(x, U) < \varepsilon\}$ and
$V_\varepsilon = \{x \in X : d(x, V) < \varepsilon\}$. These are open,
connected sets whose union is $X$, so by the preceding result,
$U_\varepsilon \cap V_\varepsilon$ is connected. Thus
$\overline{U_\varepsilon \cap V_\varepsilon}$ is connected; it is also closed
and bounded, so compact. So
$U \cap V = \bigcap_{\varepsilon > 0} \overline{U_\varepsilon \cap V_\varepsilon}$
is the intersection of a nested sequence of compact, connected sets, so is
itself connected.



2.5 Shifts and $\varphi_j$

We again fix $I \in \mathcal{J}_0$ and take $W$, $G$, $A$ etc. to be as in
Section 2.3. For $j \in \{\pm 1, \ldots, \pm d\}$, define $\sigma_j$, the
shift in direction $j$, by

$$\sigma_j(v) = v + e_j,$$

where $e_j$ is the $j$th standard basis vector if $j > 0$ and
$e_j = -e_{-j}$ if $j < 0$, and set

$$G_0^j = \{v \in G_0 : \sigma_j^{-1}(v) \notin A\} = G_0 \cap \sigma_j(\mathcal{O} \setminus A).$$

Proposition 2.12 For each $j$, the sets $I \setminus W$,
$\sigma_j(I \cap W)$ and $G_0^j$ are pairwise disjoint, and their union is an
independent set.

Proof. Trivially, $\sigma_j(I) \cap I = \emptyset$, so in particular
$(I \setminus W) \cap \sigma_j(I \cap W) = \emptyset$. That
$(I \setminus W) \cap G_0^j = \emptyset$ is trivial (because
$G_0^j \subseteq W$); and $\sigma_j(I \cap W) \cap G_0^j = \emptyset$ follows
from the definition of $G_0^j$. So the union is disjoint.

Clearly $I \setminus W$, $\sigma_j(I \cap W)$ and $G_0^j$ are all independent
sets. To show independence of the union, we must show that there are no edges
between any two of them. Since $\nabla(I \setminus W, W) = \emptyset$ (by
(12)) and $\sigma_j(I \cap W) \subseteq W$ (by (11)), we have
$\nabla(I \setminus W, \sigma_j(I \cap W) \cup G_0^j) = \emptyset$.

This leaves $\nabla(\sigma_j(I \cap W), G_0^j)$. Suppose, for a contradiction,
that $y \in G_0^j$ and $\sigma_k(y) \in \sigma_j(I \cap W)$ for some $k$. Then
$z := \sigma_j^{-1}(\sigma_k(y)) \in I \cap W \cap \mathcal{E} \subseteq G
\setminus G_0$ (by (11)), implying
$\sigma_j^{-1}(y) = \sigma_k^{-1}(z) \in A$, contrary to the assumption
$y \in G_0^j$. So $\nabla(\sigma_j(I \cap W), G_0^j) = \emptyset$.

□

Define $\sigma_j^*(I) = (I \setminus W) \cup \sigma_j(I \cap W)$ and

$$\varphi_j(I) = \{J : \sigma_j^*(I) \subseteq J \subseteq \sigma_j^*(I) \cup G_0^j\}.$$

Then Proposition 2.12 implies

$$\varphi_j(I) \subseteq \mathcal{J}.$$

Notice also that we recover $I$ from $j$, $J$ ($\in \varphi_j(I)$) and
$(G, A)$; namely, if we are given $(G, A)$, $j$, and $J \in \varphi_j(I)$,
then

$$I = (J \setminus W) \cup \sigma_j^{-1}(J \cap (W \setminus G_0^j)). \tag{15}$$

2.6 Conventions and preview 

Conventions 

In much of what remains we can ignore $I$ and concentrate on pairs from

$$\mathcal{G} := \{(G, A) \in 2^{\mathcal{E}} \times 2^{\mathcal{O}} : (G, A) \text{ satisfies (10)}\}.$$

Notice that under (10) each of $G$, $A$ determines the other.

If $(G, A)$ is produced from $I$ as in Section 2.3 then we write
$(G(I), A(I))$, noting that a given $(G, A)$ may correspond to more than one
$I$.

We will always take $W = G \cup A$ and $G_0 = \partial^*W$ (a subset of
$\mathcal{E}$ because of (10)).

Set $\ell = 2d$; so $T$ is an $\ell$-regular bigraph. (We tend to think in
terms of $d$ and use $\ell$ sparingly, for instance usually preferring $O(d)$
to the equivalent $O(\ell)$.) Though we usually work in $T$, we sometimes
(especially in Section 2.9) consider more general graphs $\Sigma$, always
assumed to satisfy

$$\Sigma \text{ is an } \ell\text{-regular bigraph with bipartition } V = \mathcal{O} \cup \mathcal{E}. \tag{16}$$

We always take $|G| = g$ and $|A| = a = (1 - \delta)g$, and for given $g$,
$\delta$ set

$$\mathcal{G}(g, \delta) = \{(G, A) \in \mathcal{G} : |G| = g, |A| = (1 - \delta)g\},$$

$$\mathcal{J}(g, \delta) = \{I \in \mathcal{J}_0 : (G(I), A(I)) \in \mathcal{G}(g, \delta)\}.$$

(It's generally best to think of $\delta$ as small, though it will not always
be so.)

As will appear, the quantity that really matters is almost always $\delta g$
($= |G| - |A|$), and it will be convenient to take, for any $t$,

$$\mathcal{G}(t) = \{(G, A) \in \mathcal{G} : |G| - |A| = t\}.$$

Notice that for $(G, A) \in \mathcal{G}(t)$,

$$|\nabla(W, V \setminus W)| \ (= |\nabla(G_0, \mathcal{O} \setminus A)|) = t\ell. \tag{17}$$

Though we don't really need $t$, we use it to emphasize a certain duality:
if $(G, A) \in \mathcal{G}(t)$ in some graph $\Sigma$ satisfying (16), then
$(\mathcal{O} \setminus A, \mathcal{E} \setminus G)$ belongs to the analogue
of $\mathcal{G}(t)$ obtained by reversing the roles of $\mathcal{O}$ and
$\mathcal{E}$ in $\Sigma$; but of course $g$ and $\delta$, unlike $t$, are not
usually preserved by this switch.
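Identity (17) is a simple edge count: every edge at a vertex of $A$ ends in $G$, so of the $\ell|G|$ edges at $G$ exactly $\ell|A|$ stay inside $W$. The sketch below checks this numerically on a small discrete torus; the function names are ours, and the set $A$ is closed up to $\{x \in \mathcal{O} : N(x) \subseteq G\}$ so that (10) holds for the pair tested.

```python
from itertools import product


def torus_neighbours(v, n):
    """Neighbours of v in the torus (Z/nZ)^d; n should be even and at least 4."""
    d = len(v)
    return [v[:i] + ((v[i] + s) % n,) + v[i + 1:]
            for i in range(d) for s in (1, -1)]


def check_identity_17(A, n):
    """Return (number of edges leaving W, t * ell) for the pair (G, A')
    generated by a set A of odd vertices, where G = N(A) and A' is the set of
    odd vertices all of whose neighbours lie in G."""
    d = len(next(iter(A)))
    G = {w for v in A for w in torus_neighbours(v, n)}
    odd = [v for v in product(range(n), repeat=d) if sum(v) % 2 == 1]
    A_closed = {x for x in odd if all(w in G for w in torus_neighbours(x, n))}
    W = G | A_closed
    leaving = sum(1 for v in W for w in torus_neighbours(v, n) if w not in W)
    t = len(G) - len(A_closed)
    return leaving, t * 2 * d


leaving_edges, t_times_ell = check_identity_17({(1, 0), (1, 2)}, 6)
```

In this toy case $|G| = 7$, $|A| = 2$ and $\ell = 4$, so both sides equal 20.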

Preview 

Our tasks are to define $\nu$, for which (5) will turn out to be obvious, and
establish (6).

We will eventually associate with each $(G, A)$ a particular index
$j = j(G, A)$, and set $j(I) = j(G(I), A(I))$. (This is basically a $j$ for
which $|G_0^j| = \log_2 |\varphi_j(I)|$ is large, though there are some
additional considerations.) We then define $\varphi(I) = \varphi_{j(I)}(I)$
and require

$$J \notin \varphi(I) \Rightarrow \nu(I, J) = 0. \tag{18}$$

Let us call $I$ small if $|G(I)| < d^3$ (we could get by with $d^{9/4}$; see
(68)), and large otherwise.

For small $I$ (an easy case, as we will see in Section 2.13) we simply
choose $j = j(I)$ to maximize $|G_0^j|$ (where $G_0 = G_0(I)$), so that, since

$$\sum_j |G_0^j| = |\nabla(G_0, \mathcal{O} \setminus A)| = \delta g \ell, \tag{19}$$

we have

$$|G_0^j| \ge \delta g. \tag{20}$$

We then set

$$\nu(I, J) = \lambda^{|J| - |I|}(1 + \lambda)^{-|G_0^j|} \quad \forall J \in \varphi(I). \tag{21}$$

(Note this satisfies (5). The separate treatment of small $I$ is unnecessary
if we only want the phase transition, but is needed for the "correct" bound in
(3).)
Most of our work (including everything in Sections 2.4 and 2.8-2.12) is
geared to large $I$ (though often valid in general). For most of our discussion
we fix $(g,\delta)$, and aim to bound the contribution of $\mathcal{J}(g,\delta)$ to (6). Of course
these contributions must eventually be summed, but this turns out not to
add anything significant.

Before beginning in earnest, we pause in Section 2.7 to adapt the isoperi- 
metric Lemma 2.4 to our situation (Lemma 2.13). This is needed especially 
in Section 2.13, but will also make an appearance in Section 2.8. 

In Sections 2.8-2.10 we associate with each relevant $(G,A)$ some
$(F,S) \in 2^{\mathcal{E}} \times 2^{\mathcal{O}}$ which "approximates" $(G,A)$ in an appropriate sense.
The definitions of $j(I)$ and $\nu(I,\cdot)$ (in Section 2.11) are then based on our
approximation to $(G(I),A(I))$. The main points are: (i) the set of possible
approximations is small (Lemma 2.18); and (ii) for a given $J$, $I$'s for which
$(G(I),A(I))$ is approximated by a particular $(F,S)$ don't contribute too much
in (6) (see (53)), construction of a $\nu$ achieving this being made possible by
the accuracy of our approximations.

The proof that $\nu$ behaves as desired (that is, of (53)) is given in
Section 2.12, and Section 2.13 is a mopping up operation, combining what we
already know for large $I$'s with the easy analysis for small $I$'s and the
isoperimetric information from Lemma 2.13, to finally establish (6).

More conventions 

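For whatever $G, A, F, S$ we have under discussion, we set $H = \mathcal{E} \setminus G$,
$B = \mathcal{O} \setminus A$, $E = \mathcal{E} \setminus F$, $T = \mathcal{O} \setminus S$, $B_0 = B \cap N(G)$, $S_0 = S \cap N(E)$, and
$E_0 = E \cap N(S)$.

From now until Section 2.13 we fix $g, \delta$ and always take $I \in \mathcal{J}(g,\delta)$ and
$(G,A) \in \mathcal{G}(g,\delta)$. (We will not see $I$ again until Section 2.11.)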

2.7 Isoperimetry 

Before continuing, we need to work out what Lemma 2.4 implies in the way
of a lower bound on $\delta$ for given $g$.

Lemma 2.13 Suppose $(G,A) \in \mathcal{G}(g,\delta)$ satisfies

(G U A) n A = 0. (22) 

Then 

$\delta \ge \begin{cases} \Omega(g^{-1/d}/d) & \text{for all } g, \\ 1 - O(1/d) & \text{if } g < d^{O(1)}. \end{cases}$

(For the (G, A)'s of interest to us, (22) is given by (7).) 

Proof. In view of (22), the lemma does not change if we replace the torus $\Gamma$
by the box $\Lambda$.

For the first part of the lemma, the main thing we have to show is

Proposition 2.14 $s(r) = \Omega(b(r)^{1-1/d})$

(where $B(r)$, $S(r)$, $b(r)$, $s(r)$ are as defined before Lemma 2.4). Notice that
this, combined with Lemma 2.4, implies that for any $C \subset \mathbb{Z}^d$,

$|\partial C| = \Omega(|C|^{(d-1)/d})$.   (23)

Proposition 2.14 is again something for which one would hope to just 
give a reference; but we could not find one, or even give the short proof that 
seems called for. 

For the proof, we'll be interested in the average number of nonzero entries
in an element of $S(q)$,

$t(q) := s(q)^{-1} \sum_{x \in S(q)} |\mathrm{supp}(x)|$.

This is useful because, setting

$N(q) = |\{(x,y) \in S(q) \times S(q+1) : x \sim y\}|$,

we have

$s(q)(2d - t(q)) = N(q) \le s(q+1)\min\{q+1, d\}$,

implying

$\dfrac{s(q)}{s(q+1)} \le \dfrac{\min\{q+1,d\}}{2d - t(q)}$.   (24)
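The double count behind (24) can be checked by brute force for small parameters. The following is only an illustrative sketch: the function names `sphere` and `check_counting_identity` are hypothetical, and $S(q)$ is taken (as an assumption about the definition preceding Lemma 2.4) to be the set of points of $\mathbb{Z}^d$ with $\ell_1$-norm exactly $q$.

```python
from itertools import product

def sphere(d, q):
    """All x in Z^d with l1-norm exactly q (brute force; small d, q only)."""
    rng = range(-q, q + 1)
    return [x for x in product(rng, repeat=d) if sum(map(abs, x)) == q]

def check_counting_identity(d, q):
    """Check s(q)(2d - t(q)) = N(q) <= s(q+1) * min(q+1, d)."""
    s_q_points = sphere(d, q)
    s_next_points = set(sphere(d, q + 1))
    s_q, s_next = len(s_q_points), len(s_next_points)
    # s(q) * t(q): total number of nonzero coordinates over S(q)
    total_support = sum(sum(1 for c in x if c != 0) for x in s_q_points)
    # N(q): adjacent pairs (x, y) with x in S(q) and y in S(q+1)
    n_pairs = 0
    for x in s_q_points:
        for i in range(d):
            for step in (-1, 1):
                y = x[:i] + (x[i] + step,) + x[i + 1:]
                if y in s_next_points:
                    n_pairs += 1
    lhs = 2 * d * s_q - total_support
    return lhs == n_pairs and n_pairs <= s_next * min(q + 1, d)
```

Each $x \in S(q)$ with support size $t$ has $t + 2(d-t) = 2d - t$ neighbours in $S(q+1)$, which is what the equality tests; the inequality reflects that each $y \in S(q+1)$ has at most $|\mathrm{supp}(y)|$ neighbours in $S(q)$.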



This already implies Proposition 2.14 for, say, $r < .9d$, since in this case
we have (using (24) together with $t(q) \le q$)

$b(r) \le s(r) \sum_{i \ge 0} \left(\dfrac{r}{2d-r}\right)^i = O(s(r))$.

For larger $r$ we will have to work harder. Here we first show, for $q = \beta d$
with $\beta \ge .9$,

$t(q) \le (1 - 1/(20\beta))d$.   (25)

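Let

$S(q,t) = \{x \in S(q) : |\mathrm{supp}(x)| = t\}$,

$s(q,t) = |S(q,t)|$, and define $B(q,t)$ and $b(q,t)$ similarly. Then

$f(q,t) := \dfrac{s(q,t+1)}{s(q,t)} = \dfrac{2(d-t)(q-t)}{(t+1)t}$.

Set $t_0 = t_0(q) = \lceil (1 - 1/(4\beta))d \rceil$. Then $t \ge t_0$ implies

$f(q,t) \le 2\,\dfrac{(1/(4\beta))(\beta - 1 + 1/(4\beta))}{(1 - 1/(4\beta))^2} = 2\left(\dfrac{2\beta-1}{4\beta-1}\right)^2 < \dfrac{1}{2}$.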



Thus

$t(q) = s(q)^{-1} \sum_{t \le q} t\, s(q,t) \le t_0 + \sum_{i \ge 1} i 2^{-i} = t_0 + 2$.

This gives (25) provided $\beta \le d/15$. For larger $\beta$ we just use

$\dfrac{s(q,d-1)}{s(q,d)} = \dfrac{d(d-1)}{2(\beta d - d + 1)} > \dfrac{d-1}{2\beta}$,

whence

$d - t(q) = s(q)^{-1} \sum_{t<d} (d-t)s(q,t) \ge \sum_{t<d} s(q,t) \Big/ \sum_{t \le d} s(q,t)$

$\ge s(q,d-1)/(s(q,d-1) + s(q,d)) > (d-1)/(2\beta + d - 1)$,

which again gives (25).

Now let $r = \gamma d \ge .9d$. By (25) and (24) we have, for $r - i \ge .9d$,

$s(r-i) \le s(r) \prod_{j=1}^{i} \dfrac{d}{2d - t(r-j)} \le s(r)(1 - \Omega(1/\gamma))^i$,

so

$b(r) \le s(r) \sum_{i=0}^{r-.9d} (1 - \Omega(1/\gamma))^i + b(.9d) = O(\gamma s(r))$   (26)

(since we know $b(.9d) = O(s(.9d)) = O(s(r))$).
On the other hand, with $t_0 = t_0(r)$, we have

$b(r) \ge b(r,t_0) = 2^{t_0}\binom{d}{t_0}\binom{r}{t_0} \ge \exp[t_0 \log(r/t_0)]$,

and $b(r)^{1/d} \ge \exp[(1 - 1/(4\gamma))\log(r/t_0)] = \Omega(\gamma)$; and this with (26) gives
Proposition 2.14.

□

Now for the first part of Lemma 2.13, we consider the possibilities $|G_0| \ge |A|$
and $|G_0| < |A|$ separately, in both cases using the fact that $|G_0| \le \delta g d$
(since $|G_0| \le |\nabla(G, \mathcal{O} \setminus A)| = \delta g d$).

If $|G_0| \ge |A|$, then $\delta \ge 1/(d+1)$, so certainly $\delta = \Omega(g^{-1/d}/d)$. If, on
the other hand, $|G_0| < |A|$, then we have (using (23) and the fact that
$\partial((G \setminus G_0) \cup A) = G_0$)

$\delta \ge |G_0|/(dg)$

$= \Omega(|(G \setminus G_0) \cup A|^{1-1/d}/(dg))$

$= \Omega(|G|^{1-1/d}/(dg))$

$= \Omega(g^{-1/d}/d)$.

For small $g$ notice that for $r \le O(1)$,

$s(r) = 2^r d^r/r! + O(d^{r-1})$,

which in view of Lemma 2.4 implies that for $C \subset \mathbb{Z}^d$ with $|C| < d^{O(1)}$,

$|\partial C| = \Omega(|C|d)$.

Applying this with $C = W \setminus G_0$ gives $|G_0| = (1 - O(1/d))g$. But then
$|\nabla(G_0, A)| \le \ell|A| = O(|G_0|)$ implies

$\delta g \ell = |\nabla(G_0, \mathcal{O} \setminus A)| \ge (\ell - O(1))|G_0| = \ell(1 - O(1/d))g$.



2.8 First approximation: covering the boundary 

Say a set $C \subseteq \Gamma$ separates $P, Q \subseteq \Gamma$ if any path meeting both $P$ and $Q$ also
meets $C$.

In this section we begin the process of approximation by showing that
there is a "small" collection of subsets of $\Gamma$, at least one of which separates $W$
($= G \cup A$) and $\Gamma \setminus W$ for each relevant $(G,A)$. We then use these separations
to show that there is a small $\mathcal{S} \subseteq 2^{\mathcal{E}} \times 2^{\mathcal{O}}$ such that each of our $(G,A)$'s is
approximated by some $(F,S) \in \mathcal{S}$ in the sense that

$F \subseteq G$, $S \supseteq A$   (27)

and

$|S \setminus A|, |G \setminus F| \le O(\delta g\sqrt{d \log d})$.   (28)

This is stated formally in Lemma 2.16 at the end of the section.
Our argument applies to pairs from

$\mathcal{G}^* := \{(G,A) \in \mathcal{G}(g,\delta) : (G,A) \text{ satisfies (7) and (14)}\}$,

though the main point, Lemma 2.15, is valid for all of $\mathcal{G}(t)$.

In this section (unlike in the next) we make substantial use of properties
particular to $\Gamma$, specifically the isoperimetric properties given by Lemma 2.4
and

$\forall\, w \sim v$ and $L \subseteq N(v)$, $|N(w) \cap N(L)| \ge |L|$   (29)

(which follows from the fact that for vertices $v \sim w$, $\Gamma[(N(v) \cup N(w)) \setminus \{v,w\}]$
is a matching of all but one vertex of $N(v)$ and all but one vertex of $N(w)$).
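Property (29) is easy to test exhaustively on a small discrete torus. The sketch below is illustrative only: the helper names are hypothetical, the torus is modelled as $(\mathbb{Z}/m\mathbb{Z})^d$ with nearest-neighbour edges (an assumption about how $\Gamma$ is built), and only one base vertex is examined because this graph is vertex-transitive.

```python
from itertools import combinations

def neighbors(v, m):
    """Neighbours of v in the torus (Z/mZ)^d with nearest-neighbour edges."""
    return [v[:i] + ((v[i] + s) % m,) + v[i + 1:]
            for i in range(len(v)) for s in (-1, 1)]

def property_29_holds(d, m):
    """Check |N(w) ∩ N(L)| >= |L| for all w ~ v and all L ⊆ N(v)."""
    v = (0,) * d  # vertex-transitivity: one base vertex suffices
    nv = neighbors(v, m)
    for w in nv:
        nw = set(neighbors(w, m))
        for k in range(len(nv) + 1):
            for L in combinations(nv, k):
                n_of_L = {u for x in L for u in neighbors(x, m)}
                if len(nw & n_of_L) < len(L):
                    return False
    return True
```

For example, `property_29_holds(3, 6)` examines all $2^6$ subsets $L$ of $N(v)$ for each of the six neighbours $w$ of the origin.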
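Let

$G_0' = \{v \in G : d_A(v) \le \ell/2\}$  ($\subseteq G_0$),

$B_0' = \{v \in B : d_H(v) \le \ell/2\}$  ($\subseteq B_0$),

$G_0'' = G_0 \setminus G_0'$ and $B_0'' = B_0 \setminus B_0'$. Then

$\nabla(G_0'', B_0'') = \emptyset$.   (30)

(The more general statement here is: if $v \in G_0$, $w \in B_0$ and $v \sim w$, then (by
(29) with $L = N(v) \cap A$) $d_G(w) \ge d_A(v)$ $(= \ell - d_B(v))$, implying
$d_B(v) + d_G(w) \ge \ell$.)

Notice that (30) implies

$G_0' \cup B_0'$ separates $W$ and $\Gamma \setminus W$   (31)

(equivalently, $\nabla(W, \Gamma \setminus W) \subseteq \nabla(G_0') \cup \nabla(B_0')$).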

Lemma 2.15 In any graph satisfying (16) and (29), for any $(G,A) \in \mathcal{G}(t)$,
there exists $U \subseteq N(G_0' \cup B_0')$ satisfying

$N(U) \supseteq G_0' \cup B_0'$   (32)

and

$|U| \le O(t\sqrt{\log \ell/\ell})$.   (33)

Before proving this, we observe that it does accomplish the first goal
stated at the beginning of this section (existence of a small set of separations).
For $(G,A)$ and $U$ as in Lemma 2.15, we have

$N(U)$ separates $W$ and $\Gamma \setminus W$   (34)

(by (31) and (32)). So we just need to limit the number of possibilities for
$U$ when $(G,A) \in \mathcal{G}^*$.
To do so, notice that

$U$ is 6-clustered.   (35)

This follows from Lemma 2.3 and (14), once we observe that $\mathrm{dist}(u, G_0) \le 2$
$\forall u \in U$ (since $U \subseteq N(G_0' \cup B_0')$), and that (32) and (30) imply $\mathrm{dist}(v, U) \le 2$
$\forall v \in G_0$.

In view of (33) (with $t = \delta g$), Lemma 2.1 then gives, for example, a bound

$O(gd^2)(Cd^6)^{O(\delta g\sqrt{\log d/d})} = \exp[O(\delta g d^{-1/2}\log^{3/2} d)]$   (36)

on the number of possibilities for $U$. Here we used Lemma 2.13 for the
equality in (36). The initial $O(gd^2)$ corresponds to a choice of $x_0$ in Lemma 2.1:
in view of (7), there must be some $j \in [-d,d] \setminus \{0\}$ and $k \le g/(2d)$ for which
$y_0 := v_0 + (2k-1)e_j \in G_0$; there are at most $g$ possibilities for this $y_0$, so at
most $O(gd^2)$ possibilities for a vertex $x_0$ with $d(x_0,y_0) \le 2$; and by (32) and
(30) $U$ must contain such an $x_0$.

Proof of Lemma 2.15.

By "duality" (see Section 2.6) it's enough to show the existence of
$S \subseteq N(G_0')$ with

$N(S) \supseteq G_0'$   (37)

and

$|S| \le O(t\sqrt{\log \ell/\ell})$.   (38)

Define $Q = \{v \in G : d_A(v) \le \sqrt{\ell \log \ell}\}$, $K = G_0 \setminus Q$, and $P = N(Q) \cap A$.
By (29),

$d_{G_0}(v) \ge \ell - \sqrt{\ell \log \ell} \quad \forall v \in P$.   (39)

Let $P' = \{v \in P : d_K(v) \ge \ell/2\}$, $P'' = P \setminus P'$, $Q' = Q \cap N(P')$, $Q'' = Q \setminus Q'$
and $R = \{v \in B \cap N(G_0') : d_{G_0}(v) \ge \sqrt{\ell \log \ell}\}$.

Now $P''$ is a cover of $Q''$ of size $O(t\sqrt{\log \ell/\ell})$, the size bound following
from $|Q| \le t\ell/(\ell - \sqrt{\ell \log \ell}) = O(t)$ (using (17)),
$d_{P''}(v) \le d_A(v) \le \sqrt{\ell \log \ell}$ $\forall v \in Q$, and $d_Q(v) \ge \ell/2 - \sqrt{\ell \log \ell}$ $\forall v \in P''$
(using (39) and the definition of $P''$).

On the other hand, we can cover $G_0' \setminus Q''$ by a similarly small subset of
$R$, as follows. From (29) we have $N(K) \cap N(G_0') \cap B \subseteq R$. This gives
$d_R(v) \ge \ell/2$ for $v \in G_0' \setminus Q$, while for $v \in Q'$,

$d_R(v) \ge |N(v) \cap N(K)| - |N(v) \cap A| \ge \ell/2 - \sqrt{\ell \log \ell}$

(the second inequality following from (29) and the definitions of $Q'$ and $Q$).
So, noting that $|R| \le t\sqrt{\ell/\log \ell}$ (again using (17)), Lemma 2.2 says that
we can cover $G_0' \setminus Q''$ by some $T \subseteq R$ of size at most
$|R|(1 + \log \ell)/(\ell/2 - \sqrt{\ell \log \ell}) \le O(t\sqrt{\log \ell/\ell})$. (And note $P \subseteq N(G_0')$ since
$Q \subseteq G_0'$, and $R \subseteq N(G_0')$ by definition, so $S := P'' \cup T \subseteq N(G_0')$.)



We now return to $\Gamma$. Given $U$ as above, let us temporarily set $L = N(U)$.
Then $|L| = O(\delta g\sqrt{d \log d})$.

Say a component $C$ of $\Gamma - L$ is large if $|C| > d$ and small otherwise.
Lemma 2.4 implies

$|\nabla(C,L)| = |\nabla(C)| \ge |\partial C| = \Omega(|C|d)$

for small $C$ (actually also for considerably larger $C$), and

$|\nabla(C,L)| = \Omega(d^2)$

for large $C$. But $|\nabla(L)| \le 2d|L| = O(\delta g d^{3/2}\sqrt{\log d})$, so

the number of large components is $O(\delta g d^{-1/2}\sqrt{\log d})$,   (40)

and the number of vertices in small components is $O(\delta g\sqrt{d \log d})$.

It follows that if $(G,A)$ is any pair satisfying (10) for which $L$ separates
$W$ and $\Gamma \setminus W$, then we satisfy (27) and (28) with

$F = P \cap \mathcal{E}$ and $S = (P \cup Q \cup L) \cap \mathcal{O}$,   (41)

where $P$ is the union of those large components of $\Gamma - L$ that meet
(equivalently, are contained in) $W$, and $Q$ is the union of (all) the small components.
In particular this is true if $(G,A)$ is any pair from $\mathcal{G}^*$ for which Lemma 2.15
applied to $(G,A)$ produces $U$.

By (40) the number of possibilities (given $L$) for $(F,S)$ as in (41) is at
most $\exp[O(\delta g d^{-1/2}\sqrt{\log d})]$, and combining this with the bound (36) on the
number of $U$'s we have

Lemma 2.16 There exist $\mathcal{S} \subseteq 2^{\mathcal{E}} \times 2^{\mathcal{O}}$ with

$|\mathcal{S}| \le \exp[O(\delta g d^{-1/2}\log^{3/2} d)]$   (42)

and a map $\pi_1 : \mathcal{G}^* \to \mathcal{S}$ such that (27) and (28) hold for each $(G,A) \in \mathcal{G}^*$
and $(F,S) = \pi_1(G,A)$.

2.9 Second approximation 

The discussion in this section is valid for any graph $\Sigma$ satisfying (16). It
may be worth reiterating that we follow the conventions given at the end of
Section 2.6.

Given $(F^*,S^*) \in 2^{\mathcal{E}} \times 2^{\mathcal{O}}$ and a positive $x$, write $\mathcal{G}' = \mathcal{G}'(F^*,S^*,x)$ for
the set of $(G,A)$'s in $\mathcal{G}(t)$ satisfying (27) (with $(F^*,S^*)$ in place of $(F,S)$)
and

$|S^* \setminus A|, |G \setminus F^*| \le x$.   (43)

Lemma 2.17 With notation as above, for any $0 < \psi < \ell$, there exist
$\mathcal{T} \subseteq 2^{\mathcal{E}} \times 2^{\mathcal{O}}$,

$|\mathcal{T}| \le \exp[O((x/\ell) + (t/\psi))\log \ell]$,   (44)

and a map $\pi_2 : \mathcal{G}' \to \mathcal{T}$ such that for each $(G,A) \in \mathcal{G}'$ and $(F,S) = \pi_2(G,A)$
we have (27) and

$v \in S \Rightarrow d_F(v) \ge \ell - \psi$, $\quad v \in E \Rightarrow d_T(v) \ge \ell - \psi$   (45)

(where as usual $E = \mathcal{E} \setminus F$ and $T = \mathcal{O} \setminus S$).

Remarks. We only need Lemma 2.17 when $(F^*,S^*) \in \mathcal{S}$ (with $\mathcal{S}$ as in
Lemma 2.16), in which case we take $t = \delta g$ and $x = O(\delta g\sqrt{d \log d})$ (with
an appropriate constant), so that $\mathcal{G}' \supseteq \pi_1^{-1}(F^*,S^*)$; but the extra generality
costs us nothing. The pairs we produce will satisfy $S \subseteq S^*$ and $F \supseteq F^*$, but
we don't need this in what follows.

Proof of Lemma 2.17.

We would like to exhibit a procedure which, for a given $(G,A) \in \mathcal{G}'$,
outputs a pair $(F,S)$ satisfying (27) and (45), and show that the set $\mathcal{T}$ of
pairs produced in this way is small.

We produce $(F,S)$ via a sequence of modifications, initializing at $(F,S) =
(F^*,S^*)$. Note that whenever we update $(F,S)$, we also automatically update
$E$, $T$, etc.

One preliminary observation:

$|S_0^*|, |E_0^*| \le x + \ell x$   (46)

(since $S_0^* \subseteq (S^* \setminus A) \cup N(G \setminus F^*)$, and similarly for $E_0^*$; recall $S_0^* = S^* \cap N(E^*)$
and $E_0^* = E^* \cap N(S^*)$, where $E^* = \mathcal{E} \setminus F^*$).

Stage 1A Set $\xi = \ell/2$.

(A.1) Repeat for as long as possible: choose $w \in H$ with $d_S(w) > \xi$ and do
$S \leftarrow S \setminus N(w)$.

(A.2) When no longer possible, do $F \leftarrow F \cup \{w \in \mathcal{E} : d_S(w) > \xi\}$.

Stage 1B Do the same thing in the dual; that is,

(B.1) for as long as possible, choose $w \in A$ with $d_E(w) > \xi$ and do
$F \leftarrow F \cup N(w)$, and

(B.2) when no longer possible, do $S \leftarrow S \setminus \{w \in \mathcal{O} : d_E(w) > \xi\}$.

Notice (a crucial idea) that $(F,S)$ produced by Stage 1 does satisfy (27).

Analysis:

The output $(F,S)$ of Stage 1 is determined by the sets of $w$'s used in
(A.1) and (B.1).

Since each iteration in (A.1) shrinks $|S|$ by at least $\xi$ while maintaining
$A \subseteq S$, the number of iterations is less than $x/\xi = 2x/\ell$. Moreover, each $w$
used in (A.1) lies in $N(S_0^*)$. So the number of possibilities for the set of $w$'s
used in (A.1) is less than

$\sum_{i < x/\xi} \binom{|N(S_0^*)|}{i} \le \exp[O((x/\ell)\log \ell)]$

(using (46)).

At the end of (A.2) we have $w \in G \setminus F \Rightarrow d_T(w) \ge \ell - \xi = \ell/2$, which,
since $|\nabla(G,T)| \le t\ell$ (see (17)), gives $|G \setminus F| \le 2t$.

Similarly, the number of choices for the set of $w$'s used in Stage 1B is at
most $\exp[O((x/\ell)\log \ell)]$ (note Stage 1A does not increase $E_0$), and at the
end of this stage we have $|S \setminus A| \le 2t$.

Stage 2 now repeats Stage 1, starting with the revised $(F,S)$, using $\psi$ in
place of $\xi$, and replacing (43) and (46) by

$|S \setminus A|, |G \setminus F| \le 2t$

and

$|S_0|, |E_0| \le 2t(1 + \ell)$.

This clearly produces an $(F,S)$ satisfying (27) and (45). Moreover,
repeating the analysis above, we find that the number of possible outputs
of Stage 2, for a given output of Stage 1, is at most $\exp[O((t/\psi)\log \ell)]$.
So the number of possible outputs of the entire procedure is no more than
$\exp[O((x/\ell) + (t/\psi))\log \ell]$.
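The two-stage procedure in the proof can be sketched in code. This is a hedged illustration rather than the authors' construction verbatim: the names `torus_adj` and `refine_stage` are hypothetical, the graph is passed as a plain adjacency dictionary, ties in the choice of $w$ are broken by sorted order (the proof allows any choice), and the true pair $(G,A)$ is assumed to satisfy $N(A) \subseteq G$.

```python
def torus_adj(d, m):
    """Adjacency sets of the torus (Z/mZ)^d (m even keeps it bipartite)."""
    from itertools import product
    return {v: {v[:i] + ((v[i] + s) % m,) + v[i + 1:]
                for i in range(d) for s in (-1, 1)}
            for v in product(range(m), repeat=d)}

def refine_stage(adj, even, odd, G, A, F, S, thresh):
    """One stage (steps A.1, A.2, B.1, B.2) with the given threshold."""
    F, S = set(F), set(S)
    H = even - G
    while True:  # (A.1): some w in H sees more than thresh vertices of S
        w = next((u for u in sorted(H) if len(adj[u] & S) > thresh), None)
        if w is None:
            break
        S -= adj[w]
    # (A.2): every remaining even vertex with large S-degree must lie in G
    F |= {u for u in even if len(adj[u] & S) > thresh}
    while True:  # (B.1): some w in A sees more than thresh vertices of E
        E = even - F
        w = next((u for u in sorted(A) if len(adj[u] & E) > thresh), None)
        if w is None:
            break
        F |= adj[w]
    E = even - F
    # (B.2): every remaining odd vertex with large E-degree lies outside A
    S -= {u for u in odd if len(adj[u] & E) > thresh}
    return F, S
```

Running the stage once with threshold $\ell/2$ and once more with threshold $\psi$ mirrors Stages 1 and 2; on output, every $v \in S$ has at most $\psi$ neighbours in $E$ and every $v \in E$ has at most $\psi$ neighbours in $S$, which in an $\ell$-regular bigraph is (45).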
2.10 Status



We now specify $t = \delta g$ and $x = O(\delta g\sqrt{d \log d})$ (the bound in (28)), and
$\psi = \sqrt{d}$ (any $\psi \in (\Omega(\sqrt{d/\log d}), O(\sqrt{d \log d}))$ would do; see the remark
following (62).) Specializing to these values and combining Lemmas 2.16
and 2.17, we have

Lemma 2.18 There exist $\mathcal{U} \subseteq 2^{\mathcal{E}} \times 2^{\mathcal{O}}$,

$|\mathcal{U}| \le \exp[O(\delta g d^{-1/2}\log^{3/2} d)]$,   (47)

and $\pi : \mathcal{G}^* \to \mathcal{U}$ such that (27) and (45) hold for each $(G,A)$ and $(F,S) =
\pi(G,A)$.

(The expression in the exponent in (47) is the maximum of the corresponding
expressions from (42) and (44).)

Now consider some $(F,S) \in \mathcal{U}$. Notice that, for any $(G,A) \in \pi^{-1}(F,S)$,
$Q := S_0 \cup E_0$ contains all vertices whose locations in the partition
$\Gamma = G \cup H \cup A \cup B$ are as yet unknown; namely, we have

$F \subseteq G$, $T \subseteq B$, $S \setminus S_0 \subseteq A$, $E \setminus E_0 \subseteq H$

(the first two containments are just (27); $S \setminus S_0 \subseteq A$ follows from $F \subseteq G$,
(10) and the definition of $S_0$, and $E \setminus E_0 \subseteq H$ is similar).

By convention, whenever we are given an $(F,S)$, we take $Q$ to be as
defined in the preceding paragraph, and write $\Gamma_Q$ for the subgraph induced by
$Q$.

2.11 Flow 

Here, finally, we define $\nu$ (for large $I$; for small $I$, see Section 2.6).

Throughout the section we fix $(F,S) \in \mathcal{U}$. It is now convenient to write
$G \sim (F,S)$ if $\pi(G,A) = (F,S)$ and $I \sim (F,S)$ if $G(I) \sim (F,S)$.

To define $\nu(I,\cdot)$ for $I \sim (F,S)$, we first need to choose a direction
$j = j(I)$. Fix such an $I$ and let $G = G(I)$, $A = A(I)$, etc. The choice of $j$ will
depend only on $(G,A)$. Observe that (using (45))

$\sum_j |\sigma_j(S_0 \cap A) \cap E_0| = |\nabla(S_0 \cap A, G \cap E_0)| \le |G \cap E_0|\psi$,

$\sum_j |\sigma_j^{-1}(E_0) \cap (S_0 \setminus A)| = |\nabla(E_0, S_0 \setminus A)| \le |S_0 \setminus A|\psi$.

But (45) and (17) imply $|G \cap E_0| + |S_0 \setminus A| \le \delta g \ell/(\ell - \psi)$, so that

$\sum_j |\sigma_j(S_0) \cap E_0| = \sum_j \left(|\sigma_j(S_0 \cap A) \cap E_0| + |\sigma_j^{-1}(E_0) \cap (S_0 \setminus A)|\right)$

$\le \delta g \ell \psi/(\ell - \psi)$.   (48)
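We assert that we can choose $j$ so that

$|G^j| \ge .8\delta g$   (49)

and

$|\sigma_j(S_0) \cap E_0| \le 10|G^j|\psi/\ell$.   (50)

To see this, let

$P = \{j \in [-d,d] \setminus \{0\} : |\sigma_j(S_0) \cap E_0| > 10|G^j|\psi/\ell\}$.

Then (48) gives

$\sum_{j \in P} |G^j| < \dfrac{\ell}{10\psi} \sum_j |\sigma_j(S_0) \cap E_0| \le \dfrac{\delta g \ell^2}{10(\ell - \psi)}$,

so (using (19))

$\sum_{j \notin P} |G^j| > (1 - \ell/(10(\ell - \psi)))\delta g \ell$.

So there exists $j \notin P$ with (say) $|G^j| > .8\delta g$, which is what we want.

□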

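Having chosen $j$ satisfying (49) and (50), we turn to defining $\nu(I,\cdot)$. Let

$C = C^j(I) = G^j \cap F \cap \sigma_j(S_0)$ $(= \sigma_j(S_0 \setminus A) \cap F)$,

$D = D^j(I) = G^j \cap (\sigma_j(T) \cup (\sigma_j(S_0) \cap E_0))$.

Then

$C \cup D$ is a partition of $G^j$.   (51)

Setting $\alpha = \alpha(\lambda) = \lambda/(1+\lambda)^2$ and $\beta = \beta(\lambda) = 1 - \alpha\lambda = (1+2\lambda)/(1+\lambda)^2$,
define

$\nu(I,J) = (\alpha\lambda)^{|C \cap J|}\beta^{|C \setminus J|}(\lambda/(1+\lambda))^{|D \cap J|}(1+\lambda)^{-|D \setminus J|}$

$= \dfrac{w(J)}{w(I)}\,\alpha^{|C \cap J|}\beta^{|C \setminus J|}(1+\lambda)^{-|D|}$ if $J \in \varphi_j(I)$,

and $\nu(I,J) = 0$ otherwise.

Then

$\sum_J \nu(I,J) = 1 \quad \forall I$   (52)

(because of (51)). On the other hand we will show, for any $J$,

$\sum_{I \sim (F,S)} \dfrac{w(I)}{w(J)}\nu(I,J) \le \ell\beta^{\delta g/2}$.   (53)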
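The normalization (52) can be confirmed with exact rational arithmetic. The sketch below rests on an assumption not spelled out in this section: that the sets $J \in \varphi_j(I)$ correspond one-to-one to subsets $X$ of $G^j = C \cup D$ (with $X = J \cap G^j$), so that summing over $J$ amounts to summing over all such $X$. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def nu_weight(X, C, D, lam):
    """Product form of nu for the subset X of C ∪ D (X plays the role of J ∩ G^j)."""
    alpha = lam / (1 + lam) ** 2
    beta = 1 - alpha * lam
    return ((alpha * lam) ** len(X & C) * beta ** len(C - X)
            * (lam / (1 + lam)) ** len(X & D) * (1 + lam) ** (-len(D - X)))

def total_mass(C, D, lam):
    """Sum of nu_weight over all subsets X of C ∪ D."""
    ground = sorted(C | D)
    return sum(nu_weight(set(X), C, D, lam)
               for r in range(len(ground) + 1)
               for X in combinations(ground, r))
```

The sum factorizes as $(\alpha\lambda + \beta)^{|C|}\,(\lambda/(1+\lambda) + 1/(1+\lambda))^{|D|} = 1$, which is why disjointness of $C$ and $D$ in (51) is what makes (52) work.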

2.12 Proof of (53) 

We need one easy lemma. Given a bigraph $\Sigma$ on $P \cup R$ and $U \subseteq R$, say that
a (vertex) cover $K \cup L \cup M$ of $\Sigma$ with $K \subseteq P$, $L \subseteq U$ and $M \subseteq R \setminus U$ is legal
(with respect to $U$) if it is a minimal cover and

$K = N(U \setminus L)$.

(Note minimality implies $K = N(R \setminus (L \cup M))$.)

Lemma 2.19 With notation as above, let $K \cup L \cup M$ be a legal cover with
$|K \cup L|$ as small as possible. Then

(a) $\forall K' \subseteq K$, $|N(K') \cap (U \setminus L)| \ge |K'|$,

(b) $\forall L' \subseteq L$, $|N(L') \setminus K| \ge |L'|$.

Proof. (a) Given $K' \subseteq K$, let $S = N(K') \cap (U \setminus L)$,

$K'' = \{v \in K : N(v) \cap U \subseteq S \cup L\}$ $(\supseteq K')$,

and $T = N(K'') \cap (R \setminus U)$. Then

(i) $(K \setminus K'') \cup (L \cup S) \cup (M \cup T)$ is a minimal cover

(a straightforward verification using the fact that each vertex of $K \setminus K''$ has
a neighbor in $U \setminus (L \cup S)$), and

(ii) $K \setminus K'' = N(U \setminus (L \cup S))$.

Minimality of $|K \cup L|$ thus implies $|K \setminus K''| + |L \cup S| \ge |K| + |L|$, so
$|S| \ge |K''| \ge |K'|$.

(b) This is similar. Given $L' \subseteq L$, let $W = N(L') \setminus K$ and

$L'' = \{u \in L \cup M : N(u) \subseteq K \cup W\}$ $(\supseteq L')$.

Then

(i) $K \cup W \cup ((L \cup M) \setminus L'')$ is a minimal cover, and

(ii) $K \cup W = N(U \setminus (L \setminus L''))$.

Minimality of $|K \cup L|$ thus implies $|K \cup W| + |L \setminus L''| \ge |K| + |L|$, and
$|W| \ge |L''| \ge |L'|$.



Proof of (53). 

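Given $(F,S)$, $J$ and $j$, set

$\mathcal{I}^* = \mathcal{I}^*(F,S,J,j) = \{I \sim (F,S) : j(I) = j,\ J \in \varphi_j(I)\}$.

We will show

$\sum_{I \in \mathcal{I}^*} \dfrac{w(I)}{w(J)}\nu(I,J) \le \beta^{\delta g/2}$,

which of course gives (53).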

Set $U = \sigma_j^{-1}(J) \cap S_0$. Suppose $I \in \mathcal{I}^*$, and set $G = G(I)$, $A = A(I)$, and

$K = K(I) = G \cap E_0$, $L = L(I) = U \setminus A$, $M = M(I) = (S_0 \setminus U) \setminus A$.

Then $K \cup L \cup M$ $(= (G \cup B) \cap Q)$ is a minimal cover of $\Gamma_Q$. (That it is a
cover follows from (10); for minimality, notice (e.g.) that each $v \in G \cap E_0$
has a neighbor in $A$, which must be in $S_0$ (using $A \subseteq S$ and the definition of
$S_0$).) Moreover, we assert,

$K = N_{\Gamma_Q}(U \setminus L)$.   (54)

Proof. We show that each side of (54) contains the other. The obvious
direction is

$N_{\Gamma_Q}(U \setminus L) = N_{\Gamma_Q}(U \cap A) \subseteq N(A) \cap E_0 = G \cap E_0 = K$.

For the reverse containment, suppose $v \in K$. Since $K \subseteq G$, (13) says that $v$
has a neighbor $u \in A \cap I$. Then $u \in S_0$ (because $v \in E \Rightarrow u \notin S \setminus S_0$), implying
$u \in U$ (since $u \in A \cap I \Rightarrow \sigma_j(u) \in J$). And of course $u \notin L$ (since $u \in A$).



Thus $K \cup L \cup M$ is a legal cover of $\Gamma_Q$ with respect to $U$ in the sense of
Lemma 2.19.

Now fix $K_0 \cup L_0 \cup M_0$, a legal cover of $\Gamma_Q$ with respect to $U$ with $|K_0 \cup L_0|$
as small as possible.

Given $I \in \mathcal{I}^*$, let $K = K(I)$ etc. be as above and set $K' = K_0 \setminus K$,
$L' = L_0 \setminus L$. Then by Lemma 2.19,

$|L| \ge |K'| + |L_0 \setminus L'|$, $\quad |K| \ge |L'| + |K_0 \setminus K'|$.   (55)

Furthermore, we assert,

$K = (K_0 \setminus K') \cup N_{\Gamma_Q}(L')$.   (56)

The point of this is that it says that $(K',L')$ determines $G$ (so also $A$), and
therefore $I \in \mathcal{I}^*$ (because of (15)).

To see (56), just observe that the only point requiring proof is
$K \setminus K_0 \subseteq N_{\Gamma_Q}(L_0 \setminus L)$, and that this follows from (54) once we notice that
$\nabla(K \setminus K_0, U \setminus (L_0 \cup L)) = \emptyset$ (since $K_0 \cup L_0$ covers $\nabla(E_0, U)$).

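Now with $C = C^j(I)$, $D = D^j(I)$ as in the discussion preceding (51),
observe that

$C \cap J = \sigma_j(L \setminus \sigma_j^{-1}(E_0))$ and $C \setminus J = \sigma_j(M \setminus \sigma_j^{-1}(E_0))$,

and that we may partition $D$ as

$D = (\sigma_j(T) \cap F) \cup (K \setminus \sigma_j(S_0 \setminus (L \cup M)))$.

Thus, with inequalities justified below,

$\dfrac{w(I)}{w(J)}\nu(I,J) = \alpha^{|\sigma_j(L \setminus \sigma_j^{-1}(E_0))|}\,\beta^{|\sigma_j(M \setminus \sigma_j^{-1}(E_0))|}\,(1+\lambda)^{-(|\sigma_j(T) \cap F| + |K \setminus \sigma_j(S_0 \setminus (L \cup M))|)}$

$\le \alpha^{|L|}\beta^{|M|}(1+\lambda)^{-(|K| + |\sigma_j(T) \cap F|)} \cdot \alpha^{-(|\sigma_j(S_0 \cap A) \cap K| + |\sigma_j^{-1}(E_0) \cap (S_0 \setminus A)|)}$   (57)

$\le \alpha^{|L|}(1+\lambda)^{-|K|}\beta^{|G^j| - (|K| + |L|)}\alpha^{-O(|G^j|\psi/\ell)}$   (58)

$\le \beta^{\delta g/2}\alpha^{|L|}(1+\lambda)^{-|K|}\beta^{-(|K| + |L|)}$   (59)

$= \beta^{\delta g/2}\left(\dfrac{\lambda}{1+2\lambda}\right)^{|L|}\left(\dfrac{1+\lambda}{1+2\lambda}\right)^{|K|}$

$\le \beta^{\delta g/2}\left(\dfrac{1+\lambda}{1+2\lambda}\right)^{|L'| + |K_0 \setminus K'|}\left(\dfrac{\lambda}{1+2\lambda}\right)^{|K'| + |L_0 \setminus L'|}$.   (60)

(In (57) we used $\alpha^{-1} = \max\{\alpha^{-1}, \beta^{-1}, 1+\lambda\}$; in (58) we used
$G^j \subseteq \sigma_j(L \cup M) \cup K \cup (\sigma_j(T) \cap F)$, $(1+\lambda)^{-1} \le \beta$ and (50); (59) is from (49), using
$(\psi/\ell)\log(1/\alpha) = o(\log(1/\beta))$, which is a consequence of

$\lambda^2 = \omega((\psi/\ell)\log(1/\lambda))$   (61)

for small $\lambda$, and easily verified when $\lambda$ is larger; and (60) comes from (55).)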
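The elementary facts about $\alpha$ and $\beta$ invoked in the justifications of (57) through (60) can be checked exactly for any positive rational $\lambda$. This is a small sanity sketch with hypothetical function names, assuming the definitions $\alpha = \lambda/(1+\lambda)^2$ and $\beta = (1+2\lambda)/(1+\lambda)^2$ from Section 2.11.

```python
from fractions import Fraction

def flow_constants(lam):
    """alpha and beta as defined in Section 2.11."""
    alpha = lam / (1 + lam) ** 2
    beta = (1 + 2 * lam) / (1 + lam) ** 2
    return alpha, beta

def check_identities(lam):
    """Exact checks of the relations used in (57)-(60)."""
    alpha, beta = flow_constants(lam)
    return (
        beta == 1 - alpha * lam                      # definition of beta
        and 1 / alpha >= 1 / beta                    # used in (57)
        and 1 / alpha >= 1 + lam                     # used in (57)
        and 1 / (1 + lam) <= beta                    # used in (58)
        and alpha / beta == lam / (1 + 2 * lam)      # passage from (59) to (60)
        and 1 / ((1 + lam) * beta) == (1 + lam) / (1 + 2 * lam)
        and alpha / beta + 1 / ((1 + lam) * beta) == 1
    )
```

The last relation, that the two bases appearing in (60) add up to 1, is what lets the sum over all pairs $(K', L')$ collapse to $\beta^{\delta g/2}$.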

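Thus, recalling (see the remark following (56)) that each $(K',L')$
corresponds to at most one $I \in \mathcal{I}^*$,

$\sum_{I \in \mathcal{I}^*} \dfrac{w(I)}{w(J)}\nu(I,J) \le \beta^{\delta g/2} \sum_{K' \subseteq K_0} \sum_{L' \subseteq L_0} \left(\dfrac{1+\lambda}{1+2\lambda}\right)^{|L'| + |K_0 \setminus K'|}\left(\dfrac{\lambda}{1+2\lambda}\right)^{|K'| + |L_0 \setminus L'|}$

$= \beta^{\delta g/2}$.

As noted earlier this gives (53).

2.13 Finally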



Now fixing J e J , we are ready to verify (6) (thus completing the proofs of 
Theorems 1.2 and 1.1). 

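Note first of all (referring to (47)) that for $\lambda < 2$ (say) (53) implies

$\sum_{I \in \mathcal{J}(g,\delta)} \dfrac{w(I)}{w(J)}\nu(I,J) = \sum_{(F,S) \in \mathcal{U}} \sum_{I \sim (F,S)} \dfrac{w(I)}{w(J)}\nu(I,J)$

$\le |\mathcal{U}|\,\ell\beta^{\delta g/2}$

$\le \ell\exp[\{O(d^{-1/2}\log^{3/2} d) - \Omega(\lambda^2)\}\delta g]$

$\le \exp[-\Omega(\lambda^2\delta g)]$,   (62)

while for larger $\lambda$,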



E ^Ky)<A-*». (63) 



Remark. Our choice of $\psi$ was constrained by the demands of (61) and (62)
(the latter since $\psi = o(\sqrt{d/\log d})$ would give, via (44), a larger bound in
(47)).

We first deal with large $I$'s (recall $I$ is large if $|G(I)| \ge d^3$). Here we have
already done the work: Assuming first that $\lambda < 2$, and with justifications to
follow, we have

$\sum_{I \text{ large}} \dfrac{w(I)}{w(J)}\nu(I,J) = \sum_{g \ge d^3} \sum_{\delta} \sum_{I \in \mathcal{J}(g,\delta)} \dfrac{w(I)}{w(J)}\nu(I,J)$

$\le \sum_{g \ge d^3} \sum_{\delta} \exp[-\Omega(\lambda^2\delta g)]$   (64)

$\le \sum_{g \ge d^3} \sum\{\exp[-\Omega(\lambda^2 i)] : i \ge \Omega(d^{-1}g^{1-1/d})\}$   (65)

$\le \sum_{g \ge d^3} \exp[-\Omega(\lambda^2 d^{-1}g^{1-1/d})]$   (66)

$\le \exp[-\Omega(\lambda^2 d^{3(1-1/d)-1})]$   (67)

$\le \exp[-\omega(\lambda d)]$.   (68)

Of course sums involving $\delta$ are restricted to $\delta$ for which $\delta g$ is an integer. The
main inequality (64) is just (62), and (65) comes from Lemma 2.13. In (66)
we have absorbed a factor $\lambda^{-2}$ in the exponent. One way (probably not the
most natural) to see the inequality in (67) is to use

$(1-\varepsilon)^{g^{1-\delta}} \le (1-\varepsilon)^{iK^{1-\delta}}$ for $i^{1/(1-\delta)}K \le g \le (i+1)^{1/(1-\delta)}K$

with $K = d^3$, $\delta = 1/d$ and $1 - \varepsilon = \exp[-\Omega(\lambda^2 d^{-1})]$.
For $\lambda \ge 2$ a similar analysis (using (63)) gives

/ large ^ ' 

Finally we turn to the easy case of small $I$. Here we abuse our notation
slightly and set

$\mathcal{J}(g,a) = \{I \in \mathcal{J} : |G(I)| = g,\ |A(I)| = a\}$.

For a (nonempty) $\mathcal{J}(g,a)$ with $g < d^3$, Lemma 2.13 gives $a = O(g/d)$, so
that, since each $A(I)$ is 2-clustered and contains $v_0$, Lemma 2.1 bounds the
number of possibilities for $A(I)$ with $I \in \mathcal{J}(g,a)$ by $\exp[O((g/d)\log d)]$.

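But we also know (see (15)) that, given $J$ and $j$, $I \in \varphi_j^{-1}(J)$ is determined
by $G(I)$ (or $A(I)$), and that (by (21), (20), and again Lemma 2.13)

$\dfrac{w(I)}{w(J)}\nu(I,J) = (1+\lambda)^{-|G^j(I)|}$

$\le (1+\lambda)^{-\delta g}$

$= (1+\lambda)^{-(1-O(1/d))g}$.

So finally, noting that $A(I) \ne \emptyset$ implies $|G(I)| \ge \ell$, we have

$\sum_{I \in \mathcal{J}(g,a)} \dfrac{w(I)}{w(J)}\nu(I,J) \le \ell\exp[O((g/d)\log d)](1+\lambda)^{-(1-O(1/d))g}$

$\le (1+\lambda)^{-(1-o(1))g}$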

and

$\sum_{I \text{ small}} \dfrac{w(I)}{w(J)}\nu(I,J) = \sum_{\ell \le g < d^3} \sum_{a < g} \sum_{I \in \mathcal{J}(g,a)} \dfrac{w(I)}{w(J)}\nu(I,J)$

$\le \sum_{\ell \le g < d^3} (1+\lambda)^{-(1-o(1))g}$

$\le (1+\lambda)^{-(1-o(1))\ell}$;

and combining this with (68) or (69) gives (6).